Subramanya Sastry has uploaded a new change for review.
https://gerrit.wikimedia.org/r/50368
Change subject: Serialize figure-captions in non-sol state to prevent RT errors.
......................................................................
Serialize figure-captions in non-sol state to prevent RT errors.
* Figure captions are not in a Start-Of-Line (SOL) position in wikitext,
so they have to be serialized in non-sol state as well. Without
this, captions that have wikitext-like chars in the sol-position
will get nowiki-escaped. For example,
[[File:foo.jpg|thumb| bar]] will get serialized to:
[[File:foo.jpg|thumb|<nowiki> bar</nowiki>]]
* This bug crept in with the switch from token-based
serialization handlers to DOM-based handlers in 8939c69. It stayed
hidden for two reasons. First, we didn't have parser tests
to cover this scenario. Second, the wikitext above wasn't getting parsed
to figure-tags because of bugs from 6e4ef20e that tackled i18n.
That i18n bug was fixed in 54695ae3, which exposed the figure
serialization bug in RT-testing.
* This patch fixes the original bug.
* No change in parser test results. TODO: Add new ones.
Change-Id: Iaa111628dd3d30118d85837a138fe10ff3d2e325
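
The save/clear/restore pattern described above can be sketched in
isolation. This is a minimal toy model, not Parsoid's serializer: the
functions escapeText and serializeFigure and the bare state object are
hypothetical stand-ins, and the escaping rule is reduced to the single
leading-space (indent-pre) case from the example. The try/finally is an
extra safeguard that the patch itself does not use.

```javascript
// Toy model of sol-sensitive escaping. All names are hypothetical and
// only illustrate the idea; they are not Parsoid's actual API.
function escapeText(state, text) {
    // At start of line, a leading space would parse as an indent-pre,
    // so the text is wrapped in <nowiki>.
    if (state.onStartOfLine && /^ /.test(text)) {
        return '<nowiki>' + text + '</nowiki>';
    }
    return text;
}

function serializeFigure(state, resource, captionText) {
    // A caption always follows "[[File:...|", so it is never at start
    // of line. Save the flag, clear it for the caption, then restore it.
    var oldSOLState = state.onStartOfLine;
    var captionSrc;
    state.onStartOfLine = false;
    try {
        captionSrc = escapeText(state, captionText);
    } finally {
        // Restore even if caption serialization throws.
        state.onStartOfLine = oldSOLState;
    }
    return '[[' + resource + '|thumb|' + captionSrc + ']]';
}

var state = { onStartOfLine: true };
// Caption text is serialized in non-sol state: no nowiki wrapper.
var figureSrc = serializeFigure(state, 'File:foo.jpg', ' bar');
// The same text outside a caption, still in sol state, is escaped.
var plainSrc = escapeText(state, ' bar');
```

In this sketch the caller's sol state is unchanged after the figure is
serialized, so text that follows the figure on a fresh line is still
escaped as before.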
---
M js/lib/mediawiki.WikitextSerializer.js
1 file changed, 16 insertions(+), 5 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/Parsoid refs/changes/68/50368/1
diff --git a/js/lib/mediawiki.WikitextSerializer.js b/js/lib/mediawiki.WikitextSerializer.js
index 6a38dee..5b19afb 100644
--- a/js/lib/mediawiki.WikitextSerializer.js
+++ b/js/lib/mediawiki.WikitextSerializer.js
@@ -519,7 +519,7 @@
return escapedText(text);
}
- var sol = state.onStartOfLine || state.emitNewlineOnNextToken,
+ var sol = state.onStartOfLine,
hasNewlines = text.match(/\n./),
hasTildes = text.match(/~{3,5}/);
if (!fullCheckNeeded && !hasNewlines && !hasTildes) {
@@ -730,10 +730,22 @@
return cb('');
}
+ // Captions don't start on a new line
+ //
+ // So, even though the figure might be in a sol-state, serialize the
+ // caption in a no-sol state and restore old state. This is required
+ // to prevent spurious wikitext escaping for this example:
+ //
+ // [[File:foo.jpg|thumb| bar]] ==> [[File:foo.jpg|thumb|<nowiki> bar</nowiki>]]
+ //
+ // In sol state, text " bar" should be nowiki escaped to prevent it from
+ // parsing to an indent-pre. But, not in figure captions.
+ var captionSrc, oldSOLState = state.onStartOfLine;
+ state.onStartOfLine = false;
+ captionSrc = state.serializeChildrenToString(caption.childNodes, WSP.wteHandlers.aHandler);
+ state.onStartOfLine = oldSOLState;
- var captionSrc = state.serializeChildrenToString(caption.childNodes,
- WSP.wteHandlers.aHandler),
- imgResource = (img && img.getAttribute('resource') || '').replace(/(^\[:)|(\]$)/g, ''),
+ var imgResource = (img && img.getAttribute('resource') || '').replace(/(^\[:)|(\]$)/g, ''),
outBits = [imgResource],
figAttrs = dp.optionList,
optNames = dp.optNames,
@@ -940,7 +952,6 @@
// When processing link text, we are no longer in newline state
// since that will be preceded by "[[" or "[" text in target wikitext.
state.onStartOfLine = false;
- state.emitNewlineOnNextToken = false;
state.wteHandlerStack.push(WSP.wteHandlers.wikilinkHandler);
var res = WSP.escapeWikiText(state, contentString);
state.wteHandlerStack.pop();
--
To view, visit https://gerrit.wikimedia.org/r/50368
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaa111628dd3d30118d85837a138fe10ff3d2e325
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/Parsoid
Gerrit-Branch: master
Gerrit-Owner: Subramanya Sastry <[email protected]>