Arlolra has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/284830

Change subject: Consider stx change as modified wrapper
......................................................................

Consider stx change as modified wrapper

 * DOM diff'ing already doesn't ignore data-parsoid changes, so this
   extra check seems superfluous.  I guess the question is why is stx so
   special?  Changes to data-mw (which are arguably just as semantic as
   stx) are only considered wrapper modifications.  Once we migrate stx
   to data-mw, would we special-case it there too?  I think we'd just
   end up doing this.

 * One of the "Illegal character references (T106578)" selser changes is
   an improvement, where we can now reuse original src and selser is
   more accurate than wt2wt.  The other is perhaps a case where the old
   algorithm was preferable.  A dt is deleted and replaced with a dd.
   Previously, the stx==row allowed the differ to mark the new dd as
   such and then reuse old src from the old one.  But, regardless, both
   serialize as broken wikitext since the row syntax only exists for dt.

 * The "HTML nested bullet list, closed tags (bug 5497)" fix is probably
   a negligible change, where previously selser was reusing a separator
   that it now determines not to use, matching wt2wt.
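
The effect of dropping the stx comparison can be sketched in isolation.
This is only an illustration, not the DOMDiff.js code: the nodes are plain
objects rather than DOM elements, `classifyChangedElement` and the
`compareStx` flag are hypothetical names, and the 'other-diff' label
stands in for the fallback branch, which this diff does not show.

```javascript
// Hypothetical stand-ins for DOM nodes: plain objects carrying a nodeName
// and a dataParsoid bag. These are not Parsoid's real data structures.
function classifyChangedElement(baseNode, newNode, compareStx) {
  // Different element types are never treated as the same wrapper.
  if (newNode.nodeName !== baseNode.nodeName) {
    return 'other-diff';
  }
  // Old behaviour (compareStx === true): a differing stx also disqualified
  // the node from being treated as a modified wrapper.
  if (compareStx && newNode.dataParsoid.stx !== baseNode.dataParsoid.stx) {
    return 'other-diff';
  }
  // New behaviour: the same nodeName is enough.
  return 'modified-wrapper';
}

// A dd that used the row syntax in the base DOM and has no stx in the
// edited DOM (assumed example data).
const baseDd = { nodeName: 'DD', dataParsoid: { stx: 'row' } };
const newDd = { nodeName: 'DD', dataParsoid: {} };

console.log(classifyChangedElement(baseDd, newDd, true));
console.log(classifyChangedElement(baseDd, newDd, false));
```

With the stx check enabled, the pair above falls through to the fallback
branch; with it disabled, the pair is classified as a modified wrapper and
the differ would go on to compare the children.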

Change-Id: I35a81946e58607646f0735264c324712cae3eb70
---
M lib/html2wt/DOMDiff.js
M tests/parserTests-blacklist.js
2 files changed, 3 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/services/parsoid refs/changes/30/284830/1

diff --git a/lib/html2wt/DOMDiff.js b/lib/html2wt/DOMDiff.js
index 358ca9c..c609dc1 100644
--- a/lib/html2wt/DOMDiff.js
+++ b/lib/html2wt/DOMDiff.js
@@ -302,8 +302,7 @@
                                if (!DU.isElt(savedNewNode)) {
                                        this.debug("--found diff: modified text/comment--");
                                        this.markNode(savedNewNode, 'deleted', DU.isBlockNodeWithVisibleWT(baseNode));
-                               } else if (savedNewNode.nodeName === baseNode.nodeName &&
-                                       DU.getDataParsoid(savedNewNode).stx === DU.getDataParsoid(baseNode).stx) {
+                               } else if (savedNewNode.nodeName === baseNode.nodeName) {
                                        // Identical wrapper-type, but modified.
                                        // Mark modified-wrapper, and recurse.
                                        this.debug("--found diff: modified-wrapper--");
diff --git a/tests/parserTests-blacklist.js b/tests/parserTests-blacklist.js
index d9d8304..fb9b48a 100644
--- a/tests/parserTests-blacklist.js
+++ b/tests/parserTests-blacklist.js
@@ -1884,7 +1884,6 @@
 add("selser", "Sanitizer: Validating that <meta> and <link> work, but only for 
Microdata [[3,0,4]]", "<div itemscope>\n<meta itemprop=\"hello\" 
content=\"world\">\n\t<meta http-equiv=\"refresh\" content=\"5\">\n\t<meta 
itemprop=\"hello\" http-equiv=\"refresh\" content=\"5\">\n\t<link 
itemprop=\"hello\" href=\"{{SERVER}}\">\n\t<link rel=\"stylesheet\" 
href=\"{{SERVER}}\">\n\t<link rel=\"stylesheet\" itemprop=\"hello\" 
href=\"{{SERVER}}\">\naqpwsr7leatyy14i</div>");
 add("selser", "Sanitizer: Validating that <meta> and <link> work, but only for 
Microdata [[0,2,0]]", "<div itemscope>\n\tq21hg5f5s2kxzuxr\n<meta 
itemprop=\"hello\" content=\"world\">\n\t<meta http-equiv=\"refresh\" 
content=\"5\">\n\t<meta itemprop=\"hello\" http-equiv=\"refresh\" 
content=\"5\">\n\t<link itemprop=\"hello\" href=\"{{SERVER}}\">\n\t<link 
rel=\"stylesheet\" href=\"{{SERVER}}\">\n\t<link rel=\"stylesheet\" 
itemprop=\"hello\" href=\"{{SERVER}}\">\n</div>");
 add("selser", "HTML bullet list, closed tags (bug 5497) [[0,0,4,[2],3]]", 
"<ul>\n<li>One</li><li>ab23ovep569io1or</li><li>02xr3ap2i2euq5miTwo</li>\n</ul>");
-add("selser", "HTML nested bullet list, closed tags (bug 5497) 
[[3,3,4,[0,1,4],3]]", "<ul><li>wabcsvv43khb0529</li><li>Two:\n<ul 
data-foobar=\"eza5cfmfpbucv7vi\">\n<li>Sub-one</li>\n<li>Sub-two</li>\n</ul>veusxfr3k57b9</li>\n</ul>");
 add("selser", "HTML nested bullet list, open tags (bug 5497) 
[[4,[3],2,[0,4],3]]", 
"<ul><li>x7e4pzqkzu2fbt9</li><li><li>zz9wcy7ovlipy14i</li>\n<li>Two:\nz6aoj185m56ogvi\n\n</ul>");
 add("selser", "HTML nested ordered list, closed tags (bug 5497) 
[[2,1,4,[0,4,3],3]]", "<ol><li>xkvf7lec17wg66r</li>\n<li 
data-foobar=\"lsd14sl7dm927qfr\">One</li><li>czwfizqug7vz33di</li><li>Two:\nu30q3qpwbmswz5mi\n</li>\n</ol>");
 add("selser", "Fuzz testing: Parser13 [2]", "whfwe630sy8pvi\n{| \n| 
http://a|");
@@ -1968,9 +1967,11 @@
 add("selser", "Illegal character references (T106578) [1]", "; Null: &#00;\n; 
FF: &#xC;\n; CR: &#xD;\n; Control (low): &#8;\n; Control (high): &#x7F; 
&#x9F;\n; Surrogate: &#xD83D;&#xDCA9;\n; This is an okay astral character: 
&#x1F4A9;");
 add("selser", "Illegal character references (T106578) 
[[3,2,0,2,[4,0],0,0,0,0,0,3,0,[4],[0,0,4,0],0,[2],4,0,0,0]]", ": 
u5vwcy0j6qg2e29: &#00;\n: p9a57ap9zyyzaor\n; FF:wjnckgkar860qkt9&#xC;\n; CR: 
&#xD;\n; Control (low)\n;5myazuveszbvgqfr: 
&#x7F;573ocw4dmb73nmi&#x9F;\n;w6mwa1rcleeqxgvi Surrogate\n: l24l65jpn4i5p14i\n; 
This is an okay astral character: &#x1F4A9;");
 add("selser", "Illegal character references (T106578) 
[[0,0,2,4,1,2,[3],0,2,4,[3,0],0,[4],[4,0,3,0],4,[2],2,0,[4],1]]", "; Null: 
&#00;\n: 3ab3bonbkmfe0zfr\n: 99du3i9inwvr6bt9: &#xC;\n: pzx5yp63uhctmx6r\n;: 
&#xD;\n: azy56mdc8dyxecdi\n: 
80ps2m246s603sor:&#8;\n;3ly1p5adoomkj4i:pkq8glqmfkyv6lxr&#x7F;&#x9F;\n: 
ei515x4atmb49529\n;x59796120h5xw29 Surrogate\n: lvy96v3olh0py14i: 
&#xD83D;&#xDCA9;\n;bb6lntpzr8me7b9: &#x1F4A9;");
+add("selser", "Illegal character references (T106578) 
[[[4],1,0,4,1,0,[2],1,4,3,3,0,2,2,0,0,1,4,0,2]]", ";67ixqan4c0n97ldi: &#00;\n: 
edptktnj6ud0wwmi: \n;nr9hj651spqlg14i CR: \n: ki5n6qfwl0yn9udi\n: 
1jdjghqs3bn2vs4i\n; Control (high)\n: 5jcuh79u3xm0lik9: &#x7F; &#x9F;\n; 
Surrogate: &#xD83D;&#xDCA9;\n: kvapfn63i0um1jor\n; This is an okay astral 
character\n: sys5rwbyregeqaor: &#x1F4A9;");
 add("selser", "Illegal character references (T106578) 
[[4,0,4,1,0,0,2,0,2,4,3,0,0,2,3,0,0,0,4,[4,0]]]", ": h13sy5yortd6xbt9: &#00;\n: 
yx2a0yvb9q5v1jor\n; FF: &#xC;\n: vo96hki38vy22o6r\n; CR: &#xD;\n: 
b42u05rrxpmvlsor\n: kr9xfl64a0dx6r\n; Control (high)\n: 75hojzjf9rqtcsor: 
&#x7F; &#x9F;\n; Surrogate: &#xD83D;&#xDCA9;\n: 
mbh8hsmaeswnrk9:ld3mddnlg6gpsyvi&#x1F4A9;");
 add("selser", "Illegal character references (T106578) 
[[1,0,0,2,[4,0],0,0,2,0,2,2,0,2,4,0,3,[2,0,0],2,2,3]]", "; Null: &#00;\n: 
wjv85ity8kw3ik9\n; FF:t0wgde3jzwp3z0k9&#xC;\n; CR\n: 3ndxxwbs6qg2e29: &#xD;\n: 
1lkqfuvhd32i6bt9\n; Control (low)\n: eeavyurhiuex9a4i: &#8;\n: 
gilnw0h204pfogvi\n; Control (high)\n: xhje4pcgmyzj8aor:5ix003l58rxjemi 
&#xD83D;&#xDCA9;\n: 2sbnezeb93k57b9\n: rdmka6d0pyqr529\n; This is an okay 
astral character");
 add("selser", "Illegal character references (T106578) 
[[3,3,4,2,[3,0],0,3,2,2,0,1,0,0,2,0,0,2,0,1,3]]", ": raewhcui99cqh0k9\n: 
ejpivaf27u6av2t9\n; FF:&#xC;\n: 6c3qzhgnawtgldi: &#xD;\n: ixbuhech1qsbrzfr\n; 
Control (low): &#8;\n; Control (high)\n: mk8de6y7kjf6flxr: &#x7F; &#x9F;\n; 
Surrogate\n: xglagbmtyma8xgvi: &#xD83D;&#xDCA9;\n; This is an okay astral 
character");
+add("selser", "Illegal character references (T106578) 
[[2,3,0,[3],4,0,[3],[3,0],4,0,[2,0],0,0,1,0,4,1,0,[3],0]]", ": 
zzrwpg9zncwpzaor\n; Null\n;\n: 1duwsiak28ci8uxr\n;:&#xD;\n: c6qxuzciqh93sor\n; 
Control (low):912209h0596vquxr &#8;\n; Control (high): &#x7F; &#x9F;\n: 
aegvfq1uih82rzfr: &#xD83D;&#xDCA9;\n;: &#x1F4A9;");
 add("selser", "Illegal character references (T106578) 
[[2,0,3,3,[4,0],3,2,[2,0],0,2,[2,0],0,2,[2,0,3,0],0,[2],0,0,[3],0]]", ": 
2p3xm2g6vb73nmi\n; Null: &#00;:120oei58fnfusor&#xC;\n: ttrk3ah2vl6m9529\n; 
CR:hr9r43lw5va38fr &#xD;\n: b6c1nojw0074aemi\n; Control (low):iy6o5agtesy30udi 
&#8;\n: h2wiauy82ajj1yvi\n; Control (high):gux535tra7ix80k9 
&#x7F;&#x9F;\n;svxwgnovvvxwdn29 Surrogate: &#xD83D;&#xDCA9;\n;: &#x1F4A9;");
 add("selser", "Illegal character references (T106578) 
[[0,4,2,4,0,3,2,0,4,2,[3,0],0,4,[2,0,0,0],0,2,2,4,0,4]]", "; Null\n: 
9fwhioeeke0442t9\n: j9rvy22kl04quxr\n: b9r3w66fxy8k6gvi: &#xC;\n: 
y1yftc25vffg8pvi\n; CR: &#xD;\n: 2qqnhtc01zjpds4i\n: h4mtw9i1zzq1714i\n; 
Control (low):&#8;\n: pbxoidwcwzlac3di:xdk2vbc7v3rf6r &#x7F; &#x9F;\n: 
ltyipg2p1w9izfr\n; Surrogate\n: 53dgbaz96yeyu8fr: &#xD83D;&#xDCA9;\n: 
w4u00ww32t7kqpvi\n; This is an okay astral character\n: ybr70brqekzw7b9");
 add("selser", "Illegal character references (T106578) 
[[[2],0,0,[4],[2,0],0,[2],0,0,1,4,0,[3],0,0,2,4,4,0,3]]", ";zu5495hxxql1sjor 
Null: &#00;\n;5cbqazxlqwbl0udi:pof7ppk4wzmpldi &#xC;\n;xtwdv922j0xusor CR: 
&#xD;\n; Control (low)\n: 7mds6rq9nzwoecdi\n;: &#x7F; &#x9F;\n: 
9k3u42yb3sl6usor\n; Surrogate\n: qb704xdxe3aqbyb9\n: phjue90joss2lnmi\n; This 
is an okay astral character");

-- 
To view, visit https://gerrit.wikimedia.org/r/284830
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I35a81946e58607646f0735264c324712cae3eb70
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/services/parsoid
Gerrit-Branch: master
Gerrit-Owner: Arlolra <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
