jenkins-bot has submitted this change and it was merged.
Change subject: TextExtracts do not crop after initials
......................................................................
TextExtracts do not crop after initials
Disables sentence termination at a full stop preceded by a capital
letter, which is likely to be an initial.
Bug: T115795
Change-Id: Ibf38e87823155c704ffb106642944cbd05e3f632
---
M includes/ExtractFormatter.php
M tests/ExtractFormatterTest.php
2 files changed, 3 insertions(+), 3 deletions(-)
Approvals:
MaxSem: Looks good to me, approved
jenkins-bot: Verified
diff --git a/includes/ExtractFormatter.php b/includes/ExtractFormatter.php
index a6581f3..644dcaa 100644
--- a/includes/ExtractFormatter.php
+++ b/includes/ExtractFormatter.php
@@ -80,7 +80,7 @@
public static function getFirstSentences( $text, $requestedSentenceCount ) {
// Based on code from OpenSearchXml by Brion Vibber
$endchars = array(
- '\.\s', '\!\s', '\?\s', // regular ASCII
+ '[^\p{Lu}]\.\s', '\!\s', '\?\s', // regular ASCII
'。', // full-width ideographic full-stop
'．', '！', '？', // double-width roman forms
'｡', // half-width ideographic full stop
diff --git a/tests/ExtractFormatterTest.php b/tests/ExtractFormatterTest.php
index 227f95c..de39909 100644
--- a/tests/ExtractFormatterTest.php
+++ b/tests/ExtractFormatterTest.php
@@ -109,12 +109,12 @@
1,
'Foo was born in 1977.',
),
- /* @fixme
+ // Bug T115795 - Test no cropping after initials
array(
'P.J. Harvey is a singer. She is awesome!',
1,
'P.J. Harvey is a singer.',
- ),*/
+ ),
);
}
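
For readers following the regex change, here is a minimal sketch of its effect on the test string above. The helper `firstSentence()` is hypothetical and much simpler than `getFirstSentences()`: it handles only the ASCII terminators and always returns a single sentence. It assumes a PHP build whose PCRE supports Unicode properties such as `\p{Lu}`.

```php
<?php
// Hypothetical helper for illustration only; not the extension's code.
// Returns the first sentence of $text, where $fullStopPattern is the
// regex fragment used to recognise an ASCII full stop ending a sentence.
function firstSentence( $text, $fullStopPattern ) {
	// Sentence terminators: the supplied full-stop pattern, or "!" / "?"
	// followed by whitespace.
	$end = '(?:' . $fullStopPattern . '|\!\s|\?\s)';
	// Lazily match from the start of the text up to the first terminator.
	if ( preg_match( '/^.+?' . $end . '/us', $text, $matches ) ) {
		return trim( $matches[0] );
	}
	// No terminator found: return the whole text.
	return trim( $text );
}

$text = 'P.J. Harvey is a singer. She is awesome!';

// Old pattern: any full stop followed by whitespace ends the sentence.
$old = firstSentence( $text, '\.\s' );

// New pattern: the character before the full stop must not be an
// uppercase letter, so "J. " no longer ends the sentence.
$new = firstSentence( $text, '[^\p{Lu}]\.\s' );

echo $old, "\n"; // P.J.
echo $new, "\n"; // P.J. Harvey is a singer.
```

One consequence of the new pattern: a sentence that genuinely ends in an uppercase letter followed by a full stop (an acronym, for example) is no longer treated as finished at that point, so the extract runs on to the next terminator.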
--
To view, visit https://gerrit.wikimedia.org/r/255959
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibf38e87823155c704ffb106642944cbd05e3f632
Gerrit-PatchSet: 3
Gerrit-Project: mediawiki/extensions/TextExtracts
Gerrit-Branch: master
Gerrit-Owner: Sumit
Gerrit-Reviewer: Jdlrobson
Gerrit-Reviewer: MaxSem
Gerrit-Reviewer: Sumit
Gerrit-Reviewer: Waldir
Gerrit-Reviewer: jenkins-bot