Subramanya Sastry has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/87632


Change subject: (Bug 54946) Fix unhandled <pre> tokenizing scenarios
......................................................................

(Bug 54946) Fix unhandled <pre> tokenizing scenarios

* The inline_element parser production did not recognize <pre>, and
  the <pre> parser production required non-empty content between
  <pre> and </pre>. This patch fixes both gaps.

* Added new parser test. The test would fail without this patch.

* Verified that parser output for en:X_Window_System is fixed.
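
The empty-<pre> gap comes down to the repetition operator on the content
rule: one-or-more (+) fails when nothing precedes the closing tag, while
zero-or-more (*) succeeds with empty content. The sketch below is only an
illustration of that difference and is not Parsoid code; matchPre and its
minItems parameter are hypothetical names, and attribute handling, entities
and nowiki are left out.

```javascript
// Hypothetical helper, not part of Parsoid: match a <pre> block at the
// start of `src`. `minItems` mimics the PEG repetition operator on the
// content rule: 1 behaves like `+`, 0 behaves like `*`.
function matchPre(src, minItems) {
  const open = "<pre>";
  const close = "</pre>";
  if (!src.startsWith(open)) {
    return null;
  }
  let pos = open.length;
  let count = 0;
  // Consume one character per iteration until the closing tag or end of input.
  while (pos < src.length && !src.startsWith(close, pos)) {
    pos += 1;
    count += 1;
  }
  if (count < minItems) {
    // The repetition did not match often enough, so the production fails.
    return null;
  }
  // Like the grammar, accept either the closing tag or end of input.
  const end = src.startsWith(close, pos) ? pos + close.length : pos;
  return { content: src.slice(open.length, pos), end: end };
}

// With `+` semantics an empty <pre></pre> is rejected; with `*` it matches.
console.log(matchPre("<pre></pre>", 1));
console.log(matchPre("<pre></pre>", 0));
```

In this simplified model, a non-empty block such as "<pre>\nfoo\n</pre>"
matches under either setting; only the empty case distinguishes the two.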

Change-Id: Ic32b1a90a927b567bf6914b81d7aa621b9df7959
---
M js/lib/pegTokenizer.pegjs.txt
M js/tests/parserTests.txt
2 files changed, 27 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/Parsoid refs/changes/32/87632/1

diff --git a/js/lib/pegTokenizer.pegjs.txt b/js/lib/pegTokenizer.pegjs.txt
index 983a625..433dc96 100644
--- a/js/lib/pegTokenizer.pegjs.txt
+++ b/js/lib/pegTokenizer.pegjs.txt
@@ -693,9 +693,7 @@
 
 inline_element
   = //& { dp('inline_element enter' + input.substr(pos, 10)); return true; }
-      & '<' nowiki
-    / & '<' xmlish_tag
-    / & '<' comment
+    & '<' ( pre / comment / nowiki / xmlish_tag )
     /// & '{' ( & '{{{{{' template / tplarg / template )
     / & '{' tplarg_or_template_or_broken
     / & '}' broken_template
@@ -1381,12 +1379,13 @@
     "<" pre_tag_name
     attribs:generic_attribute*
     endpos:(">" { return pos; })
-    // MediaWiki <pre> is special in that it converts all pre content to plain
-    // text.
-    ts:(    newlineToken
-                / (htmlentity / [^&<]+)+
-                / nowiki
-                / !("</" pre_tag_name ">") t2:(htmlentity / .) { return t2; })+
+    // MediaWiki <pre> is special in that it converts all pre content to plain text.
+    ts:(
+          newlineToken
+        / (htmlentity / [^&<]+)+
+        / nowiki
+        / !("</" pre_tag_name ">") t2:(htmlentity / .) { return t2; }
+    )*
     ("</" pre_tag_name ">" / eof) {
         stops.dec('pre');
         // return nowiki tags as well?
diff --git a/js/tests/parserTests.txt b/js/tests/parserTests.txt
index 79e91fb..9ce88e8 100644
--- a/js/tests/parserTests.txt
+++ b/js/tests/parserTests.txt
@@ -1550,6 +1550,24 @@
 </p>
 !! end
 
+!! test
+Empty pre; pre inside other HTML tags (bug 54946)
+!! input
+a
+<div><pre>
+foo
+</pre></div>
+<pre></pre>
+!! result
+<p>a
+</p>
+<div><pre>
+foo
+</pre></div>
+<pre></pre>
+
+!! end
+
 !!test
 Templates: Indent-Pre: 1a. Templates that break a line should suppress <pre>
 !!input
@@ -17519,7 +17537,7 @@
 !!input
 plain text</pre>
 !!result
-plain text
+<p>plain text&lt;/pre&gt;</p>
 !!end
 
 !!test

-- 
To view, visit https://gerrit.wikimedia.org/r/87632
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic32b1a90a927b567bf6914b81d7aa621b9df7959
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/Parsoid
Gerrit-Branch: master
Gerrit-Owner: Subramanya Sastry <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits