Arlolra has uploaded a new change for review.
https://gerrit.wikimedia.org/r/284830
Change subject: Consider stx change as modified wrapper
......................................................................
Consider stx change as modified wrapper
* DOM diff'ing already doesn't ignore data-parsoid changes, so this
extra check seems superfluous. I guess the question is why is stx so
special? Changes to data-mw (which are arguably just as semantic as
stx) are only considered wrapper modifications. Once we migrate stx to
data-mw, would we special-case it there too? I think we'd just
end up doing this.
* One of the "Illegal character references (T106578)" selser changes is
an improvement, where we can now reuse original src and selser is
more accurate than wt2wt. The other is perhaps a case where the old
algorithm was preferable. A dt is deleted and replaced with a dd.
Previously, the stx==row allowed the differ to mark the new dd as
such and then reuse old src from the old one. But, regardless, both
serialize as broken wikitext since the row syntax only exists for dt.
* The "HTML nested bullet list, closed tags (bug 5497)" fix is probably
a negligible change, where previously selser was reusing a separator
that it now determines not to use, matching wt2wt.
Change-Id: I35a81946e58607646f0735264c324712cae3eb70
---
M lib/html2wt/DOMDiff.js
M tests/parserTests-blacklist.js
2 files changed, 3 insertions(+), 3 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/mediawiki/services/parsoid
refs/changes/30/284830/1
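
As a rough illustration of the first point in the commit message, the
sketch below contrasts the old and new wrapper checks. It is not Parsoid
code: classifyOld, classifyNew, the plain-object "nodes", and the 'other'
label are hypothetical stand-ins, and the real differ reads stx through
DU.getDataParsoid(node) and has further branches this change does not show.

```javascript
// Hypothetical stand-ins for DOM nodes: plain objects carrying a nodeName
// and a dataParsoid bag (assumption: real code uses DU.getDataParsoid).

// Old rule: same element name AND same stx => modified wrapper.
function classifyOld(baseNode, newNode) {
  if (newNode.nodeName === baseNode.nodeName &&
      newNode.dataParsoid.stx === baseNode.dataParsoid.stx) {
    return 'modified-wrapper';
  }
  // Placeholder label; the real fallback branch is outside this diff.
  return 'other';
}

// New rule: same element name is enough; an stx difference alone no
// longer prevents the node from being treated as a modified wrapper.
function classifyNew(baseNode, newNode) {
  if (newNode.nodeName === baseNode.nodeName) {
    return 'modified-wrapper';
  }
  return 'other';
}

// A dd whose original carried stx "row" and whose edited copy has no stx.
const base = { nodeName: 'DD', dataParsoid: { stx: 'row' } };
const edited = { nodeName: 'DD', dataParsoid: {} };

console.log(classifyOld(base, edited)); // prints "other"
console.log(classifyNew(base, edited)); // prints "modified-wrapper"
```

Under the new rule the pair falls into the modified-wrapper branch, so the
differ recurses into the children instead of treating the node as replaced.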
diff --git a/lib/html2wt/DOMDiff.js b/lib/html2wt/DOMDiff.js
index 358ca9c..c609dc1 100644
--- a/lib/html2wt/DOMDiff.js
+++ b/lib/html2wt/DOMDiff.js
@@ -302,8 +302,7 @@
 if (!DU.isElt(savedNewNode)) {
     this.debug("--found diff: modified text/comment--");
     this.markNode(savedNewNode, 'deleted', DU.isBlockNodeWithVisibleWT(baseNode));
-} else if (savedNewNode.nodeName === baseNode.nodeName &&
-        DU.getDataParsoid(savedNewNode).stx === DU.getDataParsoid(baseNode).stx) {
+} else if (savedNewNode.nodeName === baseNode.nodeName) {
     // Identical wrapper-type, but modified.
     // Mark modified-wrapper, and recurse.
     this.debug("--found diff: modified-wrapper--");
diff --git a/tests/parserTests-blacklist.js b/tests/parserTests-blacklist.js
index d9d8304..fb9b48a 100644
--- a/tests/parserTests-blacklist.js
+++ b/tests/parserTests-blacklist.js
@@ -1884,7 +1884,6 @@
add("selser", "Sanitizer: Validating that <meta> and <link> work, but only for
Microdata [[3,0,4]]", "<div itemscope>\n<meta itemprop=\"hello\"
content=\"world\">\n\t<meta http-equiv=\"refresh\" content=\"5\">\n\t<meta
itemprop=\"hello\" http-equiv=\"refresh\" content=\"5\">\n\t<link
itemprop=\"hello\" href=\"{{SERVER}}\">\n\t<link rel=\"stylesheet\"
href=\"{{SERVER}}\">\n\t<link rel=\"stylesheet\" itemprop=\"hello\"
href=\"{{SERVER}}\">\naqpwsr7leatyy14i</div>");
add("selser", "Sanitizer: Validating that <meta> and <link> work, but only for
Microdata [[0,2,0]]", "<div itemscope>\n\tq21hg5f5s2kxzuxr\n<meta
itemprop=\"hello\" content=\"world\">\n\t<meta http-equiv=\"refresh\"
content=\"5\">\n\t<meta itemprop=\"hello\" http-equiv=\"refresh\"
content=\"5\">\n\t<link itemprop=\"hello\" href=\"{{SERVER}}\">\n\t<link
rel=\"stylesheet\" href=\"{{SERVER}}\">\n\t<link rel=\"stylesheet\"
itemprop=\"hello\" href=\"{{SERVER}}\">\n</div>");
add("selser", "HTML bullet list, closed tags (bug 5497) [[0,0,4,[2],3]]",
"<ul>\n<li>One</li><li>ab23ovep569io1or</li><li>02xr3ap2i2euq5miTwo</li>\n</ul>");
-add("selser", "HTML nested bullet list, closed tags (bug 5497)
[[3,3,4,[0,1,4],3]]", "<ul><li>wabcsvv43khb0529</li><li>Two:\n<ul
data-foobar=\"eza5cfmfpbucv7vi\">\n<li>Sub-one</li>\n<li>Sub-two</li>\n</ul>veusxfr3k57b9</li>\n</ul>");
add("selser", "HTML nested bullet list, open tags (bug 5497)
[[4,[3],2,[0,4],3]]",
"<ul><li>x7e4pzqkzu2fbt9</li><li><li>zz9wcy7ovlipy14i</li>\n<li>Two:\nz6aoj185m56ogvi\n\n</ul>");
add("selser", "HTML nested ordered list, closed tags (bug 5497)
[[2,1,4,[0,4,3],3]]", "<ol><li>xkvf7lec17wg66r</li>\n<li
data-foobar=\"lsd14sl7dm927qfr\">One</li><li>czwfizqug7vz33di</li><li>Two:\nu30q3qpwbmswz5mi\n</li>\n</ol>");
add("selser", "Fuzz testing: Parser13 [2]", "whfwe630sy8pvi\n{| \n|
http://a|");
@@ -1968,9 +1967,11 @@
add("selser", "Illegal character references (T106578) [1]", "; Null: �\n;
FF: \n; CR: 
\n; Control (low): \n; Control (high): 
Ÿ\n; Surrogate: ��\n; This is an okay astral character:
💩");
add("selser", "Illegal character references (T106578)
[[3,2,0,2,[4,0],0,0,0,0,0,3,0,[4],[0,0,4,0],0,[2],4,0,0,0]]", ":
u5vwcy0j6qg2e29: �\n: p9a57ap9zyyzaor\n; FF:wjnckgkar860qkt9\n; CR:

\n; Control (low)\n;5myazuveszbvgqfr:
573ocw4dmb73nmiŸ\n;w6mwa1rcleeqxgvi Surrogate\n: l24l65jpn4i5p14i\n;
This is an okay astral character: 💩");
add("selser", "Illegal character references (T106578)
[[0,0,2,4,1,2,[3],0,2,4,[3,0],0,[4],[4,0,3,0],4,[2],2,0,[4],1]]", "; Null:
�\n: 3ab3bonbkmfe0zfr\n: 99du3i9inwvr6bt9: \n: pzx5yp63uhctmx6r\n;:

\n: azy56mdc8dyxecdi\n:
80ps2m246s603sor:\n;3ly1p5adoomkj4i:pkq8glqmfkyv6lxrŸ\n:
ei515x4atmb49529\n;x59796120h5xw29 Surrogate\n: lvy96v3olh0py14i:
��\n;bb6lntpzr8me7b9: 💩");
+add("selser", "Illegal character references (T106578)
[[[4],1,0,4,1,0,[2],1,4,3,3,0,2,2,0,0,1,4,0,2]]", ";67ixqan4c0n97ldi: �\n:
edptktnj6ud0wwmi: \n;nr9hj651spqlg14i CR: \n: ki5n6qfwl0yn9udi\n:
1jdjghqs3bn2vs4i\n; Control (high)\n: 5jcuh79u3xm0lik9:  Ÿ\n;
Surrogate: ��\n: kvapfn63i0um1jor\n; This is an okay astral
character\n: sys5rwbyregeqaor: 💩");
add("selser", "Illegal character references (T106578)
[[4,0,4,1,0,0,2,0,2,4,3,0,0,2,3,0,0,0,4,[4,0]]]", ": h13sy5yortd6xbt9: �\n:
yx2a0yvb9q5v1jor\n; FF: \n: vo96hki38vy22o6r\n; CR: 
\n:
b42u05rrxpmvlsor\n: kr9xfl64a0dx6r\n; Control (high)\n: 75hojzjf9rqtcsor:
 Ÿ\n; Surrogate: ��\n:
mbh8hsmaeswnrk9:ld3mddnlg6gpsyvi💩");
add("selser", "Illegal character references (T106578)
[[1,0,0,2,[4,0],0,0,2,0,2,2,0,2,4,0,3,[2,0,0],2,2,3]]", "; Null: �\n:
wjv85ity8kw3ik9\n; FF:t0wgde3jzwp3z0k9\n; CR\n: 3ndxxwbs6qg2e29: 
\n:
1lkqfuvhd32i6bt9\n; Control (low)\n: eeavyurhiuex9a4i: \n:
gilnw0h204pfogvi\n; Control (high)\n: xhje4pcgmyzj8aor:5ix003l58rxjemi
��\n: 2sbnezeb93k57b9\n: rdmka6d0pyqr529\n; This is an okay
astral character");
add("selser", "Illegal character references (T106578)
[[3,3,4,2,[3,0],0,3,2,2,0,1,0,0,2,0,0,2,0,1,3]]", ": raewhcui99cqh0k9\n:
ejpivaf27u6av2t9\n; FF:\n: 6c3qzhgnawtgldi: 
\n: ixbuhech1qsbrzfr\n;
Control (low): \n; Control (high)\n: mk8de6y7kjf6flxr:  Ÿ\n;
Surrogate\n: xglagbmtyma8xgvi: ��\n; This is an okay astral
character");
+add("selser", "Illegal character references (T106578)
[[2,3,0,[3],4,0,[3],[3,0],4,0,[2,0],0,0,1,0,4,1,0,[3],0]]", ":
zzrwpg9zncwpzaor\n; Null\n;\n: 1duwsiak28ci8uxr\n;:
\n: c6qxuzciqh93sor\n;
Control (low):912209h0596vquxr \n; Control (high):  Ÿ\n:
aegvfq1uih82rzfr: ��\n;: 💩");
add("selser", "Illegal character references (T106578)
[[2,0,3,3,[4,0],3,2,[2,0],0,2,[2,0],0,2,[2,0,3,0],0,[2],0,0,[3],0]]", ":
2p3xm2g6vb73nmi\n; Null: �:120oei58fnfusor\n: ttrk3ah2vl6m9529\n;
CR:hr9r43lw5va38fr 
\n: b6c1nojw0074aemi\n; Control (low):iy6o5agtesy30udi
\n: h2wiauy82ajj1yvi\n; Control (high):gux535tra7ix80k9
Ÿ\n;svxwgnovvvxwdn29 Surrogate: ��\n;: 💩");
add("selser", "Illegal character references (T106578)
[[0,4,2,4,0,3,2,0,4,2,[3,0],0,4,[2,0,0,0],0,2,2,4,0,4]]", "; Null\n:
9fwhioeeke0442t9\n: j9rvy22kl04quxr\n: b9r3w66fxy8k6gvi: \n:
y1yftc25vffg8pvi\n; CR: 
\n: 2qqnhtc01zjpds4i\n: h4mtw9i1zzq1714i\n;
Control (low):\n: pbxoidwcwzlac3di:xdk2vbc7v3rf6r  Ÿ\n:
ltyipg2p1w9izfr\n; Surrogate\n: 53dgbaz96yeyu8fr: ��\n:
w4u00ww32t7kqpvi\n; This is an okay astral character\n: ybr70brqekzw7b9");
add("selser", "Illegal character references (T106578)
[[[2],0,0,[4],[2,0],0,[2],0,0,1,4,0,[3],0,0,2,4,4,0,3]]", ";zu5495hxxql1sjor
Null: �\n;5cbqazxlqwbl0udi:pof7ppk4wzmpldi \n;xtwdv922j0xusor CR:

\n; Control (low)\n: 7mds6rq9nzwoecdi\n;:  Ÿ\n:
9k3u42yb3sl6usor\n; Surrogate\n: qb704xdxe3aqbyb9\n: phjue90joss2lnmi\n; This
is an okay astral character");
--
To view, visit https://gerrit.wikimedia.org/r/284830
Gerrit-MessageType: newchange
Gerrit-Change-Id: I35a81946e58607646f0735264c324712cae3eb70
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/services/parsoid
Gerrit-Branch: master
Gerrit-Owner: Arlolra <[email protected]>