ArielGlenn has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/362173 )
Change subject: clean up illegal chars in revision text retrieved during flow content dumps
......................................................................
clean up illegal chars in revision text retrieved during flow content dumps
Bug: T167456
Change-Id: I66dd1f6a5e1b8df11261b037289f9d2bfa9b8512
---
M maintenance/dumpTextPass.php
1 file changed, 2 insertions(+), 0 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/Flow refs/changes/73/362173/1
diff --git a/maintenance/dumpTextPass.php b/maintenance/dumpTextPass.php
index 40773e2..e874be2 100644
--- a/maintenance/dumpTextPass.php
+++ b/maintenance/dumpTextPass.php
@@ -207,6 +207,8 @@
 		$tryIsPrefetch = false;
 		$text = $revision->getContent( $format );
 		if ( $text !== false ) {
+			// filter out any illegal characters that might have been in revision text in database
+			$text = preg_replace( '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $text );
 			return $text;
 		}
 	}
--
To view, visit https://gerrit.wikimedia.org/r/362173
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I66dd1f6a5e1b8df11261b037289f9d2bfa9b8512
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/Flow
Gerrit-Branch: master
Gerrit-Owner: ArielGlenn <[email protected]>
_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits