jenkins-bot has submitted this change and it was merged.
Change subject: (Bug 52824) Improved handling of '=' char in tpl-param values
......................................................................
(Bug 52824) Improved handling of '=' char in tpl-param values
* Presence of the '=' char in the wikitext value of a tpl-param
is insufficient to determine whether it will be nowiki escaped.
Ex: In {{echo|{{echo|1=foo}}}}, the first arg of the outermost
param does have the '=' char in the wt, but it doesn't need to
be nowiki escaped nevertheless (nor converted into a named
parameter).
This patch moves the decision for whether to convert a positional
param (with a '=' char in its wikitext) to a named param to the
wikitext escaping code where this decision can be made more
precisely.
This is the bug reported in 52824
* In addition, this patch also adds a heuristic to determine when
to nowiki-escape a positional param with a '=' char and when to
convert it to a named param. Because of the idiosyncracies of
positional and named params, converting a positional param to
a named param usually forces all later positional params to be
converted to named params as well which then introduced a lot of
dirty diffs, and also makes it quite painful to maintain the
transclusion wikitext.
* Added/updated existing parser tests.
NOTE: Without adding data-parsoid to parser tests, it is not
possible to test the heuristic bit of this patch. This is one
more case where info in data-parsoid is essential to test some
of Parsoid's serialization features.
Change-Id: I8d2277e4d931a934c29ec5a61985d159c2537217
---
M js/lib/mediawiki.WikitextSerializer.js
M js/tests/parserTests.txt
2 files changed, 77 insertions(+), 21 deletions(-)
Approvals:
GWicke: Looks good to me, approved
jenkins-bot: Verified
diff --git a/js/lib/mediawiki.WikitextSerializer.js
b/js/lib/mediawiki.WikitextSerializer.js
index 47edc1a..a5c9187 100644
--- a/js/lib/mediawiki.WikitextSerializer.js
+++ b/js/lib/mediawiki.WikitextSerializer.js
@@ -880,16 +880,46 @@
* width of the opening and closing wikitext tags and not
* the entire DOM range they span in the end.
*/
-WSP.escapeTplArgWT = function(state, arg, isPositional) {
+WSP.escapeTplArgWT = function(state, arg, opts) {
+ var serializeAsNamed = opts.serializeAsNamed;
+
function escapeStr(str, buf, pos) {
var bracketPairStrippedStr = str.replace(/\[([^\[\]]*)\]/g,
'_$1_'),
genericMatch = pos.start && /^\{/.test(str) ||
pos.end && /\}$/.test(str) ||
/\{\{|\}\}|[\[\]\|]/.test(bracketPairStrippedStr);
- if (genericMatch ||
- // Can't allow '=' in positional parameters
- (isPositional && /[=]/.test(str)))
- {
+
+ // '=' is not allowed in positional parameters. We can either
+ // nowiki escape it or convert the named parameter into a
+ // positional param to avoid the escaping.
+ if (opts.isTemplate && !serializeAsNamed && /[=]/.test(str)) {
+ // In certain situations, it is better to add a nowiki
escape
+ // rather than convert this to a named param.
+ //
+ // Ex: Consider: {{funky-tpl|a|b|c|d|e|f|g|h}}
+ //
+ // If an editor changes 'a' to 'f=oo' and we convert it
to
+ // a named param 1=f=oo, we are effectively converting
all
+ // the later params into named params as well and we get
+ // {{funky-tpl|1=f=oo|2=b|3=c|...|8=h}} instead of
+ // {{funky-tpl|<nowiki>f=oo</nowiki>|b|c|...|h}}
+ //
+ // The latter is better in this case. This is a real
problem
+ // in production.
+ //
+ // For now, we use a simple heuristic below and can be
+ // refined later, if necessary
+ //
+ // 1. Either there were no original positional args
+ // 2. Or, only the last positional arg uses '='
+ if (opts.numPositionalArgs === 0 ||
+ opts.numPositionalArgs === opts.argIndex)
+ {
+ serializeAsNamed = true;
+ }
+ }
+
+ if (genericMatch || (!serializeAsNamed && /[=]/.test(str))) {
buf.push("<nowiki>");
buf.push(str);
buf.push("</nowiki>");
@@ -973,7 +1003,7 @@
}
}
- return buf.join('');
+ return { serializeAsNamed: serializeAsNamed, v: buf.join('') };
};
/**
@@ -2946,6 +2976,17 @@
};
WSP._buildTemplateWT = function(node, state, srcParts) {
+ function countPositionalArgs(tpl, paramInfos) {
+ var res = 0;
+ paramInfos.forEach(function(paramInfo) {
+ var k = paramInfo.k;
+ if (tpl.params[k] !== undefined && !paramInfo.named) {
+ res++;
+ }
+ });
+ return res;
+ }
+
var buf = [],
serializer = this,
dp = node.data.parsoid;
@@ -2978,6 +3019,7 @@
n = keys.length;
if (n > 0) {
var numericIndex = 1,
+ numPositionalArgs =
countPositionalArgs(tpl, paramInfos),
pushArg = function (k, paramInfo) {
if (!paramInfo) {
paramInfo = {};
@@ -2987,17 +3029,15 @@
// Default to ' = '
spacing. Anything that matches
// this does not
remember spc explicitly.
spc = ['', ' ', ' ',
''],
- serializeAsNamed =
false;
+ opts = {
+
serializeAsNamed: false,
+ isTemplate:
isTpl,
+
numPositionalArgs: numPositionalArgs,
+ argIndex:
numericIndex
+ };
- // NOTE: Escaping the "=" char
in the regexp because JSHint
- // complains that it can be
confused by other developers.
- //
http://jslinterrors.com/a-regular-expression-literal-can-be-confused-with/
-
- // Use named serialization if
the value contains a '='
- if (paramInfo.named || k !==
numericIndex.toString() ||
- isTpl && (/\=/).test(v))
- {
- serializeAsNamed = true;
+ if (paramInfo.named || k !==
numericIndex.toString()) {
+ opts.serializeAsNamed =
true;
}
if (paramInfo.spc) {
@@ -3006,17 +3046,16 @@
// TODO: match the
space style of other/ parameters!
//spc = ['', ' ', ' ',
''];
//}
-
- if (serializeAsNamed) {
+ var res =
serializer.escapeTplArgWT(state, v, opts);
+ if (res.serializeAsNamed) {
// Escape as value only
// Trim WS
- v =
serializer.escapeTplArgWT(state, v.trim(), false);
- argBuf.push(spc[0] + k
+ spc[1] + "=" + spc[2] + v + spc[3]);
+ argBuf.push(spc[0] + k
+ spc[1] + "=" + spc[2] + res.v.trim() + spc[3]);
} else {
numericIndex++;
// Escape as positional
parameter
// No WS trimming
-
argBuf.push(serializer.escapeTplArgWT(state, v, true));
+ argBuf.push(res.v);
}
};
diff --git a/js/tests/parserTests.txt b/js/tests/parserTests.txt
index 7f33f47..eac618e 100644
--- a/js/tests/parserTests.txt
+++ b/js/tests/parserTests.txt
@@ -1766,6 +1766,17 @@
</p>
!! end
+## Bug 52824
+!! test
+Templates: '=' char in nested transclusions should not trigger nowiki escapes
or conversion to named param
+!! options
+parsoid=html2wt,wt2wt
+!! input
+{{echo|{{echo|1=bar}}}}
+!! result
+<p about="#mwt1" typeof="mw:Transclusion"
data-mw='{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"{{echo|1=bar}}"}},"i":0}'>bar</p>
+!! end
+
###
### Parsoid-centric tests for testing RT edge cases for pre
###
@@ -17349,10 +17360,16 @@
parsoid=html2wt
!! input
{{echo|1 = f=oo}}
+
+{{echo|1 = f=oo|2 = bar}}
!! result
<p about="#mwt1" typeof="mw:Transclusion"
data-mw='{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"f=oo"}},"i":0}'
>foo</p>
+
+<p about="#mwt1" typeof="mw:Transclusion"
+data-mw='{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"f=oo"},
"2":{"wt":"bar"}},"i":0}'
+>foo</p>
!! end
# -----------------------------------------------------------------
--
To view, visit https://gerrit.wikimedia.org/r/79836
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I8d2277e4d931a934c29ec5a61985d159c2537217
Gerrit-PatchSet: 2
Gerrit-Project: mediawiki/extensions/Parsoid
Gerrit-Branch: master
Gerrit-Owner: Subramanya Sastry <[email protected]>
Gerrit-Reviewer: GWicke <[email protected]>
Gerrit-Reviewer: Subramanya Sastry <[email protected]>
Gerrit-Reviewer: jenkins-bot
_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits