[MediaWiki-commits] [Gerrit] mediawiki/core[master]: Protect -{...}- variant constructs in definition lists.
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/327112 ) Change subject: Protect -{...}- variant constructs in definition lists. .. Protect -{...}- variant constructs in definition lists. Given the wikitext: ;-{zh-cn:AAA;zh-tw:BBB}- Prevent `doBlockLevels` from trying to split the definition list at the embedded colon and using `AAA;zh-tw:BBB}-` as the `` portion. Bug: T153135 Change-Id: I3a4d02f1fbd0d0fe8278d6b7c66005f0dd3dd36b --- M includes/parser/BlockLevelPass.php M tests/parser/parserTests.txt 2 files changed, 89 insertions(+), 34 deletions(-) Approvals: Tim Starling: Looks good to me, approved C. Scott Ananian: Looks good to me, approved jenkins-bot: Verified diff --git a/includes/parser/BlockLevelPass.php b/includes/parser/BlockLevelPass.php index cbacd34..e16cfd4 100644 --- a/includes/parser/BlockLevelPass.php +++ b/includes/parser/BlockLevelPass.php @@ -38,6 +38,7 @@ const COLON_STATE_COMMENT = 5; const COLON_STATE_COMMENTDASH = 6; const COLON_STATE_COMMENTDASHDASH = 7; + const COLON_STATE_LC = 8; /** * Make lists from lines starting with ':', '*', '#', etc. @@ -389,15 +390,14 @@ * @return string The position of the ':', or false if none found */ private function findColonNoLinks( $str, &$before, &$after ) { - $colonPos = strpos( $str, ':' ); - if ( $colonPos === false ) { + if ( !preg_match( '/:|<|-\{/', $str, $m, PREG_OFFSET_CAPTURE ) ) { # Nothing to find! return false; } - $ltPos = strpos( $str, '<' ); - if ( $ltPos === false || $ltPos > $colonPos ) { + if ( $m[0][0] === ':' ) { # Easy; no tag nesting to worry about + $colonPos = $m[0][1]; $before = substr( $str, 0, $colonPos ); $after = substr( $str, $colonPos + 1 ); return $colonPos; @@ -405,9 +405,10 @@ # Ugly state machine to walk through avoiding tags. $state = self::COLON_STATE_TEXT; - $level = 0; + $ltLevel = 0; + $lcLevel = 0; $len = strlen( $str ); - for ( $i = 0; $i < $len; $i++ ) { + for ( $i = $m[0][1]; $i < $len; $i++ ) { $c = $str[$i]; switch ( $state ) { @@ -418,7 +419,7 @@ $state = self::COLON_STATE_TAGSTART; break; case ":": - if ( $level === 0 ) { + if ( $ltLevel === 0 ) { # We found it! $before = substr( $str, 0, $i ); $after = substr( $str, $i + 1 ); @@ -428,35 +429,44 @@ break; default: # Skip ahead looking for something interesting - $colonPos = strpos( $str, ':', $i ); - if ( $colonPos === false ) { + if ( !preg_match( '/:|<|-\{/', $str, $m, PREG_OFFSET_CAPTURE, $i ) ) { # Nothing else interesting return false; } - $ltPos = strpos( $str, '<', $i ); - if ( $level === 0 ) { - if ( $ltPos === false || $colonPos < $ltPos ) { - # We found it! - $before = substr( $str, 0, $colonPos ); - $after = substr( $str, $colonPos + 1 ); - return $i; - } + if ( $m[0][0] === '-{' ) { + $state = self::COLON_STATE_LC; + $lcLevel++; + $i = $m[0][1] + 1; + } else { + # Skip ahead to next interesting character. + $i = $m[0][1] - 1; } - if ( $ltPos === false ) { -
[MediaWiki-commits] [Gerrit] mediawiki/core[master]: Protect -{...}- variant constructs in definition lists.
C. Scott Ananian has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/327112 ) Change subject: Protect -{...}- variant constructs in definition lists. .. Protect -{...}- variant constructs in definition lists. Given the wikitext: ;-{zh-cn:AAA;zh-tw:BBB}- Prevent `doBlockLevels` from trying to split the definition list at the embedded colon and using `AAA;zh-tw:BBB}-` as the `` portion. Bug: T153135 Change-Id: I3a4d02f1fbd0d0fe8278d6b7c66005f0dd3dd36b --- M includes/parser/BlockLevelPass.php M tests/parser/parserTests.txt 2 files changed, 40 insertions(+), 31 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/mediawiki/core refs/changes/12/327112/1 diff --git a/includes/parser/BlockLevelPass.php b/includes/parser/BlockLevelPass.php index cbacd34..1bb3c49 100644 --- a/includes/parser/BlockLevelPass.php +++ b/includes/parser/BlockLevelPass.php @@ -38,6 +38,7 @@ const COLON_STATE_COMMENT = 5; const COLON_STATE_COMMENTDASH = 6; const COLON_STATE_COMMENTDASHDASH = 7; + const COLON_STATE_LC = 8; /** * Make lists from lines starting with ':', '*', '#', etc. @@ -389,15 +390,14 @@ * @return string The position of the ':', or false if none found */ private function findColonNoLinks( $str, &$before, &$after ) { - $colonPos = strpos( $str, ':' ); - if ( $colonPos === false ) { + if ( !preg_match( '/:|<|-\{/', $str, $m, PREG_OFFSET_CAPTURE ) ) { # Nothing to find! return false; } - $ltPos = strpos( $str, '<' ); - if ( $ltPos === false || $ltPos > $colonPos ) { + if ( $m[0][0] === ':' ) { # Easy; no tag nesting to worry about + $colonPos = $m[0][1]; $before = substr( $str, 0, $colonPos ); $after = substr( $str, $colonPos + 1 ); return $colonPos; @@ -405,9 +405,10 @@ # Ugly state machine to walk through avoiding tags. $state = self::COLON_STATE_TEXT; - $level = 0; + $ltLevel = 0; + $lcLevel = 0; $len = strlen( $str ); - for ( $i = 0; $i < $len; $i++ ) { + for ( $i = $m[0][1]; $i < $len; $i++ ) { $c = $str[$i]; switch ( $state ) { @@ -418,7 +419,7 @@ $state = self::COLON_STATE_TAGSTART; break; case ":": - if ( $level === 0 ) { + if ( $ltLevel === 0 ) { # We found it! $before = substr( $str, 0, $i ); $after = substr( $str, $i + 1 ); @@ -428,35 +429,44 @@ break; default: # Skip ahead looking for something interesting - $colonPos = strpos( $str, ':', $i ); - if ( $colonPos === false ) { + if ( !preg_match( '/:|<|-\{/', $str, $m, PREG_OFFSET_CAPTURE, $i ) ) { # Nothing else interesting return false; } - $ltPos = strpos( $str, '<', $i ); - if ( $level === 0 ) { - if ( $ltPos === false || $colonPos < $ltPos ) { - # We found it! - $before = substr( $str, 0, $colonPos ); - $after = substr( $str, $colonPos + 1 ); - return $i; - } + if ( $m[0][0] === '-{' ) { + $state = self::COLON_STATE_LC; + $lcLevel++; + $i = $m[0][1] + 1; + } else { + # Skip ahead to next interesting character. + $i = $m[0][1] - 1; } - if ( $ltPos === false ) { -