Subramanya Sastry has uploaded a new change for review.
https://gerrit.wikimedia.org/r/180563
Change subject: Tweak <nowiki/> removal heuristic a bit
......................................................................
Tweak <nowiki/> removal heuristic a bit
* Roundtrip testing turns up a lot of wikitext like this:
'<nowiki/>''foo'' and ''[[bar]]''
Our current conservative heuristic won't strip the <nowiki/> in that
scenario, because it refuses to match '[' inside a quoted segment.
So, for now, add another hacky heuristic that also accepts a simple
[[link]] inside a quoted segment. We really need a line-based
heuristic that can examine wikitext chunks that were emitted and
distinguish between output chunks.
That is coming later as part of what Scott is working on.
For now, this should help us minimize regressions.
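
A rough sketch of the effect on the example line above, for reviewers who
want to poke at it outside Parsoid. The two patterns are copied from the
diff in this change; splitOnNowiki and allPiecesMatch are hypothetical
helper names, not functions in mediawiki.WikitextSerializer.js, and the
remaining conditions of the real check (cut off in the diff context) are
not reproduced here.

```javascript
// Patterns copied from the diff: the old and the new testRE.
const oldRE = /^[^']+$|^[^']*(('''''[^\[\{<']+'''''|'''[^\[\{<']+'''|''[^\[\{<']+''|')([^']+|$))+('|$)$/;
const newRE = /^[^']+$|^[^']*(('''''(\[\[\w+\]\]|[^\[\{<']+)'''''|'''(\[\[\w+\]\]|[^\[\{<']+)'''|''(\[\[\w+\]\]|[^\[\{<']+)''|')([^']+|$))+('|$)$/;

// Hypothetical helper: drop <nowiki>...</nowiki> spans, then split the
// line on <nowiki/>, mirroring the "pieces" step in the patch.
function splitOnNowiki(line) {
  return line.replace(/<nowiki>.*?<\/nowiki>/g, '').split(/<nowiki\/>/);
}

// Hypothetical helper: true only if every piece passes the pattern.
// The real serializer applies further conditions that are omitted here.
function allPiecesMatch(re, line) {
  return splitOnNowiki(line).every(function (piece) {
    return re.test(piece);
  });
}

const line = "'<nowiki/>''foo'' and ''[[bar]]''";

console.log(splitOnNowiki(line));         // two pieces: "'" and "''foo'' and ''[[bar]]''"
console.log(allPiecesMatch(oldRE, line)); // false: '[' is rejected inside ''...''
console.log(allPiecesMatch(newRE, line)); // true: [[bar]] is accepted inside ''...''

// Piped links are still rejected, since \w+ does not match '|'.
console.log(allPiecesMatch(newRE, "'<nowiki/>''[[bar|baz]]''")); // false
```

Only links whose target is a plain \w+ word get through; anything with a
pipe, a space, or a namespace colon inside the quoted segment still fails
the test, so the <nowiki/> is kept in those cases.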
Change-Id: I2759e76d56703254d3907ac447644457bc007b4b
---
M lib/mediawiki.WikitextSerializer.js
1 file changed, 5 insertions(+), 2 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/mediawiki/services/parsoid refs/changes/63/180563/1
diff --git a/lib/mediawiki.WikitextSerializer.js b/lib/mediawiki.WikitextSerializer.js
index 79ba9bd..ea45bf7 100644
--- a/lib/mediawiki.WikitextSerializer.js
+++ b/lib/mediawiki.WikitextSerializer.js
@@ -1220,16 +1220,19 @@
// Within the matched quote-segments, be conservative and don't match higher-priority
// parser characters like [{< -- used for links and templates. This should prevent
// inadvertent matching up across links/templates/tags.
- var testRE = /^[^']+$|^[^']*(('''''[^\[\{<']+'''''|'''[^\[\{<']+'''|''[^\[\{<']+''|')([^']+|$))+('|$)$/;
+ var testRE = /^[^']+$|^[^']*(('''''(\[\[\w+\]\]|[^\[\{<']+)'''''|'''(\[\[\w+\]\]|[^\[\{<']+)'''|''(\[\[\w+\]\]|[^\[\{<']+)''|')([^']+|$))+('|$)$/;
return wt.split(/\n|$/).map(function(line) {
+ if (!/<nowiki\/>/.test(line)) {
+ return line;
+ }
+
// * Strip out nowiki-protected strings since we are only interested in
// quote sequences that correspond to <i>/<b> tags.
// * Find segments separated by <nowiki/>s.
// * If all the segments contain balanced i/b tags, and the <nowiki/>
// separated a quote and an i/b tag, we can remove all the <nowiki/>s
var pieces = line.replace(/<nowiki>.*?<\/nowiki>/g, '').split(/<nowiki\/>/);
-
var n = pieces.length;
for (var i = 0; i < n; i++) {
if (!testRE.test(pieces[i]) ||
--
To view, visit https://gerrit.wikimedia.org/r/180563
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2759e76d56703254d3907ac447644457bc007b4b
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/services/parsoid
Gerrit-Branch: master
Gerrit-Owner: Subramanya Sastry <[email protected]>