jenkins-bot has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/389531 )

Change subject: T171381: Linter: Handle optional end tags in missing-end-tag 
category
......................................................................


T171381: Linter: Handle optional end tags in missing-end-tag category

See https://www.w3.org/TR/html5/syntax.html#optional-tags

There are a bunch of other contexts in which other end tags (like dt,
dd, thead, tbody, p, rb, etc.) are optional, but that requires
contextual DOM analysis and I haven't reasoned through the spec and the
code that is required to handle them properly. Since these other tags
are uncommon in HTML form, I am skipping that detail at this time.
Those additional sources of false positives can be addressed in
followup patches on demand based on how commonly these reports
show up in linter output.

Change-Id: I4455fa26a969fc8dea56273259140d13e9995236
---
M lib/wt2html/pp/handlers/linter.js
M tests/mocha/linter.js
2 files changed, 25 insertions(+), 1 deletion(-)

Approvals:
  jenkins-bot: Verified
  Arlolra: Looks good to me, approved



diff --git a/lib/wt2html/pp/handlers/linter.js 
b/lib/wt2html/pp/handlers/linter.js
index afbb732..aa20ff9 100644
--- a/lib/wt2html/pp/handlers/linter.js
+++ b/lib/wt2html/pp/handlers/linter.js
@@ -222,6 +222,27 @@
                DU.getDataParsoid(contentNode).name === name);
 }
 
+function endTagOptional(node) {
+       // See https://www.w3.org/TR/html5/syntax.html#optional-tags
+       //
+       // End tags for tr/td/th/li are entirely optional since they
+       // require a parent container and can only be followed by like
+       // kind.
+       //
+       // Caveat: <li>foo</li><ol>..</ol> and <li>foo<ol>..</ol>
+       // generate different DOM trees, so explicit </li> tag
+       // is required to specify which of the two was intended.
+       //
+       // With that one caveat around nesting, the parse with/without
+       // the end tag is identical. For now, ignoring that caveat
+       // since they aren't like to show up in our corpus much.
+       //
+       // For the other tags in that w3c spec section, I haven't reasoned
+       // through when exactly they are optional. Not handling that complexity
+       // for now since those are likely uncommon use cases in our corpus.
+       return /^(TR|TD|TH|LI)$/.test(node.nodeName);
+}
+
 /*
  * Log Treebuilder fixups marked by dom.markTreeBuilderFixup.js
  * It handles the following scenarios:
@@ -319,7 +340,7 @@
                                        }
                                        adjDp.tmp.linted = true;
                                        env.log('lint/misnested-tag', lintObj);
-                               } else if (DU.hasLiteralHTMLMarker(dp)) {
+                               } else if (DU.hasLiteralHTMLMarker(dp) && 
!endTagOptional(c)) {
                                        env.log('lint/missing-end-tag', 
lintObj);
                                }
                        }
diff --git a/tests/mocha/linter.js b/tests/mocha/linter.js
index 3eea822..ab8bb5b 100644
--- a/tests/mocha/linter.js
+++ b/tests/mocha/linter.js
@@ -67,6 +67,9 @@
                                result[0].params.should.have.a.property("name", 
"p");
                        });
                });
+               it('should not flag tags where end tags are optional in the 
spec', function() {
+                       return 
expectEmptyResults('<ul><li>x<li>y</ul><table><tr><th>heading 1<tr><td>col 
1<td>col 2</table>');
+               });
        });
 
        describe('STRIPPED TAGS', function() {

-- 
To view, visit https://gerrit.wikimedia.org/r/389531
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4455fa26a969fc8dea56273259140d13e9995236
Gerrit-PatchSet: 6
Gerrit-Project: mediawiki/services/parsoid
Gerrit-Branch: master
Gerrit-Owner: Subramanya Sastry <ssas...@wikimedia.org>
Gerrit-Reviewer: Arlolra <abrea...@wikimedia.org>
Gerrit-Reviewer: C. Scott Ananian <canan...@wikimedia.org>
Gerrit-Reviewer: Legoktm <lego...@member.fsf.org>
Gerrit-Reviewer: Sbailey <sbai...@wikimedia.org>
Gerrit-Reviewer: Subramanya Sastry <ssas...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits

Reply via email to