jenkins-bot has submitted this change and it was merged.
Change subject: Validate and normalize file contents in FFS
......................................................................
Validate and normalize file contents in FFS
1) Only accept valid UTF-8. Converting other encodings on the fly
would be possible, but it is unclear whether that extra complexity
is worth the effort right now. Follow-up work is probably needed
to use better exception classes (MWException is being deprecated)
and to handle those exceptions appropriately.
2) Normalize the input to NFC, the standard MediaWiki Unicode
normalization form. There is probably a small (unmeasured) performance
penalty here, but it should be negligible because:
* parsing should only happen when group definitions are updated
  (known issues exist)
* the whole file is normalized once before parsing, rather than
  each message individually
This should prevent unexpected issues with search, translation memory,
insertables, no-change diffs and many other things.
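
The two steps above can be sketched outside MediaWiki roughly as follows.
This is only an illustration, not the code in this change: it assumes
PHP 7+ with the mbstring and intl extensions, and swaps in
mb_check_encoding() and Normalizer::normalize() for
StringUtils::isUtf8() and UtfNormal::cleanUp(), which may behave
differently (cleanUp() probably does more than plain NFC). The function
name validateAndNormalize() and the use of RuntimeException are
hypothetical.

```php
<?php
// Sketch only: plain PHP stand-ins for the MediaWiki helpers used in
// the actual change. Requires the mbstring and intl extensions.

/**
 * Hypothetical helper: reject input that is not valid UTF-8, then
 * normalize the whole string to NFC in a single pass.
 *
 * @param string $input Raw file contents.
 * @param string $filename Only used in error messages.
 * @return string NFC-normalized contents.
 * @throws RuntimeException on invalid UTF-8 or failed normalization.
 */
function validateAndNormalize( string $input, string $filename ): string {
	// Step 1: only accept valid UTF-8.
	if ( !mb_check_encoding( $input, 'UTF-8' ) ) {
		throw new RuntimeException( "Contents of $filename are not valid UTF-8." );
	}

	// Step 2: normalize the whole input once, before any parsing.
	$normalized = Normalizer::normalize( $input, Normalizer::FORM_C );
	if ( $normalized === false ) {
		throw new RuntimeException( "Unable to normalize contents of $filename." );
	}

	return $normalized;
}
```

With this sketch, an "e" followed by U+0301 (combining acute accent)
comes back as the single code point U+00E9, while a byte sequence such
as "\xC3\x28" is rejected with an exception before any parsing happens.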
Bug: T87503
Change-Id: Ib8e0348dd562c7b82b07705fc07d87476f49f961
---
M ffs/SimpleFFS.php
1 file changed, 7 insertions(+), 1 deletion(-)
Approvals:
Amire80: Looks good to me, approved
jenkins-bot: Verified
diff --git a/ffs/SimpleFFS.php b/ffs/SimpleFFS.php
index 7c939c2..765df8e 100644
--- a/ffs/SimpleFFS.php
+++ b/ffs/SimpleFFS.php
@@ -100,7 +100,7 @@
*
* @param string $code Language code.
* @return array|bool False if the file does not exist
- * @throws MWException if the file appears to exist, but cannot be read
+ * @throws MWException if the file is not readable or has bad encoding
*/
public function read( $code ) {
if ( !$this->exists( $code ) ) {
@@ -113,6 +113,12 @@
throw new MWException( "Unable to read file $filename." );
}
+ if ( !StringUtils::isUtf8( $input ) ) {
+ throw new MWException( "Contents of $filename are not valid utf-8." );
+ }
+
+ $input = UtfNormal::cleanUp( $input );
+
return $this->readFromVariable( $input );
}
--
To view, visit https://gerrit.wikimedia.org/r/189191
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8e0348dd562c7b82b07705fc07d87476f49f961
Gerrit-PatchSet: 5
Gerrit-Project: mediawiki/extensions/Translate
Gerrit-Branch: master
Gerrit-Owner: Nikerabbit <[email protected]>
Gerrit-Reviewer: Amire80 <[email protected]>
Gerrit-Reviewer: jenkins-bot <>