Smalyshev has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/326037 )

Change subject: [WIP][DNM] Allow extensions to hook features
......................................................................

[WIP][DNM] Allow extensions to hook features

Also move GeoFeature to GeoData extension.

Change-Id: Id08efd46337a977639ebf3724ee3492512f326ac
---
M README
M autoload.php
A docs/hooks.txt
D includes/Query/GeoFeature.php
M includes/Search/RescoreBuilders.php
M includes/Search/SearchContext.php
M includes/Searcher.php
M profiles/RescoreProfiles.config.php
D tests/unit/Query/GeoFeatureTest.php
9 files changed, 166 insertions(+), 638 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/CirrusSearch 
refs/changes/37/326037/1
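
A minimal sketch of how an extension might use the new 
'CirrusSearchAddQueryFeatures' hook. It is illustrative only and not part of 
this change. The KeywordFeature interface below is a simplified stand-in for 
\CirrusSearch\Query\KeywordFeature so the snippet runs outside MediaWiki. 
ExampleGeoFeature, onCirrusSearchAddQueryFeatures() and collectFeatures() are 
hypothetical names. collectFeatures() imitates the filtering loop that this 
change adds to Searcher.php, calling the handler directly instead of going 
through Hooks::run().

```php
<?php
// Simplified stand-in for \CirrusSearch\Query\KeywordFeature (assumption:
// the real interface lives in CirrusSearch and takes a SearchContext).
interface KeywordFeature {
	public function apply( $context, $term );
}

// Hypothetical feature an extension such as GeoData might provide.
class ExampleGeoFeature implements KeywordFeature {
	public function apply( $context, $term ) {
		// A real feature would parse its keywords out of $term;
		// this placeholder leaves the term untouched.
		return $term;
	}
}

// Hypothetical handler for the 'CirrusSearchAddQueryFeatures' hook.
function onCirrusSearchAddQueryFeatures( $config, array &$extraFeatures ) {
	$extraFeatures[] = new ExampleGeoFeature();
	return true;
}

// Imitates the loop added to Searcher.php: run the handler, then keep
// only the objects that implement KeywordFeature.
function collectFeatures( array $features, $config, callable $handler ) {
	$extraFeatures = [];
	$handler( $config, $extraFeatures );
	foreach ( $extraFeatures as $extra ) {
		if ( $extra instanceof KeywordFeature ) {
			$features[] = $extra;
		}
	}
	return $features;
}

$features = collectFeatures( [], null, 'onCirrusSearchAddQueryFeatures' );
echo count( $features ), "\n";
```

In a real extension the handler would be registered under the hook name in 
the usual MediaWiki way rather than called by hand.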

diff --git a/README b/README
index 23f5f92..f76813f 100644
--- a/README
+++ b/README
@@ -260,27 +260,7 @@
 
 Hooks
 -----
-CirrusSearch provides hooks that other extensions can make use of to extend 
the core schema and
-modify documents.
-
-There are currently two phases to building cirrus documents: the parse phase 
and the links phase.
-The parse phase then the links phase is run when the article's rendered text 
would change (actual
-article change and template change).  Only the links phase is run when an 
article is newly links
-or unlinked.
-
-Note that this whole thing is a somewhat experimental feature at this point 
and the API hasn't
-really been settled.
-
-'CirrusSearchAnalysisConfig': Allows to hook into the configuration for 
analysis
- &config - multi-dimensional configuration array for analysis of various 
languages and fields
- $builder - instance of MappingConfigBuilder, for easier use of utility 
methods to build fields
-
-'CirrusSearchMappingConfig': Allows configuration of the mapping of fields
- &config - multi-dimensional configuration array that contains Elasticsearch 
document configuration.
-   The 'page' index contains configuration for Elasticsearch documents 
representing pages.
-   The 'namespace' index contains namespace configuration for Elasticsearch 
documents representing
-   namespaces.
-
+See docs/hooks.txt.
 
 Validating a new version of Elasticsearch
 -----------------------------------------
diff --git a/autoload.php b/autoload.php
index 92f7d56..8baa1db 100644
--- a/autoload.php
+++ b/autoload.php
@@ -112,7 +112,6 @@
        'CirrusSearch\\Query\\FullTextQueryBuilder' => __DIR__ . 
'/includes/Query/FullTextQueryBuilder.php',
        'CirrusSearch\\Query\\FullTextQueryStringQueryBuilder' => __DIR__ . 
'/includes/Query/FullTextQueryStringQueryBuilder.php',
        'CirrusSearch\\Query\\FullTextSimpleMatchQueryBuilder' => __DIR__ . 
'/includes/Query/FullTextSimpleMatchQueryBuilder.php',
-       'CirrusSearch\\Query\\GeoFeature' => __DIR__ . 
'/includes/Query/GeoFeature.php',
        'CirrusSearch\\Query\\HasTemplateFeature' => __DIR__ . 
'/includes/Query/HasTemplateFeature.php',
        'CirrusSearch\\Query\\InCategoryFeature' => __DIR__ . 
'/includes/Query/InCategoryFeature.php',
        'CirrusSearch\\Query\\InTitleFeature' => __DIR__ . 
'/includes/Query/InTitleFeature.php',
diff --git a/docs/hooks.txt b/docs/hooks.txt
new file mode 100644
index 0000000..c2e32ba
--- /dev/null
+++ b/docs/hooks.txt
@@ -0,0 +1,47 @@
+CirrusSearch provides hooks that other extensions can make use of to extend 
the core schema and
+modify documents.
+
+There are currently two phases to building cirrus documents: the parse phase 
and the links phase.
+The parse phase followed by the links phase is run when the article's rendered text 
would change (actual
+article change and template change).  Only the links phase is run when an 
article is newly linked
+or unlinked.
+
+Note that this whole thing is a somewhat experimental feature at this point 
and the API hasn't
+really been settled.
+
+'CirrusSearchAnalysisConfig': Allows extensions to hook into the configuration for 
analysis
+ &$config - multi-dimensional configuration array for analysis of various 
languages and fields
+
+'CirrusSearchMappingConfig': Allows configuration of the mapping of fields
+ &$config - multi-dimensional configuration array that contains Elasticsearch 
document configuration.
+   The 'page' index contains configuration for Elasticsearch documents 
representing pages.
+   The 'namespace' index contains namespace configuration for Elasticsearch 
documents representing
+   namespaces.
+ $builder - instance of MappingConfigBuilder, for easier use of utility 
methods to build fields.
+
+'CirrusSearchBuildDocumentParse': Allows extensions to modify the Elasticsearch 
document produced from a page
+ $doc - \Elastica\Document object representing the page. Extensions can modify 
it.
+ $title - Title object representing the page.
+ $content - Content object for the page.
+ $parserOutput - ParserOutput for the page if it exists, or null.
+
+'CirrusSearchBuildDocumentLinks': Allows extensions to process incoming and 
outgoing links for the document.
+ $doc - \Elastica\Document object representing the page. Extensions can add 
links to it.
+ $title - Title object representing the page.
+ $connection - \CirrusSearch\Connection object representing the connection to the 
Elasticsearch server.
+
+'CirrusSearchBuildDocumentFinishBatch': Called when a batch of pages has been 
indexed.
+ $pages - list of WikiPage objects which have been indexed.
+
+'CirrusSearchAddQueryFeatures': Allows extensions to add query parser features
+ $config - SearchConfig object which holds current search configuration
+ &$extraFeatures - array holding feature objects. This is where the extension 
should add its features.
+ The feature class should implement \CirrusSearch\Query\KeywordFeature.
+
+'CirrusSearchScoreBuilder': Allows extensions to define rescore builder 
functions
+ $func - function definition map, with values:
+   type - function name
+   For other parameter examples, see RescoreProfiles.config.php
+ $context - SearchContext object
+ $weight - score weight
+ &$builder - object implementing the function. Should be
\ No newline at end of file
diff --git a/includes/Query/GeoFeature.php b/includes/Query/GeoFeature.php
deleted file mode 100644
index 33e1b73..0000000
--- a/includes/Query/GeoFeature.php
+++ /dev/null
@@ -1,219 +0,0 @@
-<?php
-
-namespace CirrusSearch\Query;
-
-use CirrusSearch\Search\SearchContext;
-use CirrusSearch\SearchConfig;
-use Elastica\Query\AbstractQuery;
-use GeoData\GeoData;
-use GeoData\Coord;
-use GeoData\Globe;
-use Title;
-
-/**
- * Applies geo based features to the query.
- *
- * Two forms of geo based querying are provided: a filter that limits search
- * results to a geographic area and a boost that increases the score of
- * results within the geographic area. Supports specifying geo coordinates
- * either by providing a latitude and longitude, or a page title to source the
- * latitude and longitude from. All values can be prefixed with a radius in m
- * or km to apply. If not specified this defaults to 5km.
- *
- * Examples:
- *  neartitle:Shanghai
- *  neartitle:50km,Seoul
- *  nearcoord:1.2345,-5.4321
- *  nearcoord:17km,54.321,-12.345
- *  boost-neartitle:"San Francisco"
- *  boost-neartitle:50km,Kampala
- *  boost-nearcoord:-12.345,87.654
- *  boost-nearcoord:77km,34.567,76.543
- */
-class GeoFeature extends SimpleKeywordFeature {
-       // Default radius, in meters
-       const DEFAULT_RADIUS = 5000;
-       // Default globe
-       const DEFAULT_GLOBE = 'earth';
-
-       /**
-        * @return string[]
-        */
-       protected function getKeywords() {
-               return ['boost-nearcoord', 'boost-neartitle', 'nearcoord', 
'neartitle'];
-       }
-
-       /**
-        * @param SearchContext $context
-        * @param string $key The keyword
-        * @param string $value The value attached to the keyword with quotes 
stripped
-        * @param string $quotedValue The original value in the search string, 
including quotes if used
-        * @param bool $negated Is the search negated? Not used to generate the 
returned AbstractQuery,
-        *  that will be negated as necessary. Used for any other 
building/context necessary.
-        * @return array Two element array, first an AbstractQuery or null to 
apply to the
-        *  query. Second a boolean indicating if the quotedValue should be 
kept in the search
-        *  string.
-        */
-       protected function doApply( SearchContext $context, $key, $value, 
$quotedValue, $negated ) {
-               if ( !class_exists( GeoData::class ) ) {
-                       return [ null, false ];
-               }
-
-               if ( substr( $key, -5 ) === 'title' ) {
-                       list( $coord, $radius, $excludeDocId ) = 
$this->parseGeoNearbyTitle(
-                               $context->getConfig(),
-                               $value
-                       );
-               } else {
-                       list( $coord, $radius ) = $this->parseGeoNearby( $value 
);
-                       $excludeDocId = '';
-               }
-
-               $filter = null;
-               if ( $coord ) {
-                       if ( substr( $key, 0, 6 ) === 'boost-' ) {
-                               $context->addGeoBoost( $coord, $radius, 
$negated ? 0.1 : 1 );
-                       } else {
-                               $filter = self::createQuery( $coord, $radius, 
$excludeDocId );
-                       }
-               }
-
-               return [ $filter, false ];
-       }
-
-       /**
-        * radius, if provided, must have either m or km suffix. Valid formats:
-        *   <title>
-        *   <radius>,<title>
-        *
-        * @param SearchConfig $config the Cirrus config object
-        * @param string $text user input to parse
-        * @return array Three member array with Coordinate object, integer 
radius
-        *  in meters, and page id to exclude from results.. When invalid the
-        *  Coordinate returned will be null.
-        */
-       public function parseGeoNearbyTitle( SearchConfig $config, $text ) {
-               $title = Title::newFromText( $text );
-               if ( $title && $title->exists() ) {
-                       // Default radius if not provided: 5km
-                       $radius = self::DEFAULT_RADIUS;
-               } else {
-                       // If the provided value is not a title try to extract 
a radius prefix
-                       // from the beginning. If $text has a valid radius 
prefix see if the
-                       // remaining text is a valid title to use.
-                       $pieces = explode( ',', $text, 2 );
-                       if ( count( $pieces ) !== 2 ) {
-                               return [ null, 0, '' ];
-                       }
-                       $radius = $this->parseDistance( $pieces[0] );
-                       if ( $radius === null ) {
-                               return [ null, 0, '' ];
-                       }
-                       $title = Title::newFromText( $pieces[1] );
-                       if ( !$title || !$title->exists() ) {
-                               return [ null, 0, '' ];
-                       }
-               }
-
-               $coord = GeoData::getPageCoordinates( $title );
-               if ( !$coord ) {
-                       return [ null, 0, '' ];
-               }
-
-               return [ $coord, $radius, $config->makeId( 
$title->getArticleID() ) ];
-       }
-
-       /**
-        * radius, if provided, must have either m or km suffix. Latitude and 
longitude
-        * must be floats in the domain of [-90:90] for latitude and [-180,180] 
for
-        * longitude. Valid formats:
-        *   <lat>,<lon>
-        *   <radius>,<lat>,<lon>
-        *
-        * @param string $text
-        * @return array Two member array with Coordinate object, and integer 
radius
-        *  in meters. When invalid the Coordinate returned will be null.
-        */
-       public function parseGeoNearby( $text ) {
-               $pieces = explode( ',', $text, 3 );
-               // Default radius if not provided: 5km
-               $radius = self::DEFAULT_RADIUS;
-               if ( count( $pieces ) === 3 ) {
-                       $radius = $this->parseDistance( $pieces[0] );
-                       if ( $radius === null ) {
-                               return [ null, 0 ];
-                       }
-                       $lat = $pieces[1];
-                       $lon = $pieces[2];
-               } elseif ( count( $pieces ) === 2 ) {
-                       $lat = $pieces[0];
-                       $lon = $pieces[1];
-               } else {
-                       return [ null, 0 ];
-               }
-
-               $globe = new Globe( self::DEFAULT_GLOBE );
-               if ( !$globe->coordinatesAreValid( $lat, $lon ) ) {
-                       return [ null, 0 ];
-               }
-
-               return [
-                       new Coord( floatval( $lat ), floatval( $lon ), 
$globe->getName() ),
-                       $radius,
-               ];
-       }
-
-       /**
-        * @param string $distance
-        * @return int|null Parsed distance in meters, or null if unparsable
-        */
-       public function parseDistance( $distance ) {
-               if ( !preg_match( '/^(\d+)(m|km|mi|ft|yd)$/', $distance, 
$matches ) ) {
-                       return null;
-               }
-
-               $scale = [
-                       'm' => 1,
-                       'km' => 1000,
-                       // Supported non-SI units, and their conversions, 
sourced from
-                       // 
https://en.wikipedia.org/wiki/Unit_of_length#Imperial.2FUS
-                       'mi' => 1609.344,
-                       'ft' => 0.3048,
-                       'yd' => 0.9144,
-               ];
-
-               return max( 10, (int) round( $matches[1] * $scale[$matches[2]] 
) );
-       }
-
-       /**
-        * Create a filter for near: and neartitle: queries.
-        *
-        * @param Coord $coord
-        * @param int $radius Search radius in meters
-        * @param string $docIdToExclude Document id to exclude, or "" for no 
exclusions.
-        * @return AbstractQuery
-        */
-       public static function createQuery( Coord $coord, $radius, 
$docIdToExclude = '' ) {
-               $query = new \Elastica\Query\BoolQuery();
-               $query->addFilter( new \Elastica\Query\Term( [ 
'coordinates.globe' => $coord->globe ] ) );
-               $query->addFilter( new \Elastica\Query\Term( [ 
'coordinates.primary' => 1 ] ) );
-
-               $distanceFilter = new \Elastica\Query\GeoDistance(
-                       'coordinates.coord',
-                       [ 'lat' => $coord->lat, 'lon' => $coord->lon ],
-                       $radius . 'm'
-               );
-               $distanceFilter->setOptimizeBbox( 'indexed' );
-               $query->addFilter( $distanceFilter );
-
-               if ( $docIdToExclude !== '' ) {
-                       $query->addMustNot( new \Elastica\Query\Term( [ '_id' 
=> $docIdToExclude ] ) );
-               }
-
-               $nested = new \Elastica\Query\Nested();
-               $nested->setPath( 'coordinates' )->setQuery( $query );
-
-               return $nested;
-       }
-
-}
diff --git a/includes/Search/RescoreBuilders.php 
b/includes/Search/RescoreBuilders.php
index caf2d50..9e59ab7 100644
--- a/includes/Search/RescoreBuilders.php
+++ b/includes/Search/RescoreBuilders.php
@@ -2,10 +2,10 @@
 
 namespace CirrusSearch\Search;
 
-use CirrusSearch\Query\GeoFeature;
 use CirrusSearch\Util;
 use Elastica\Query\FunctionScore;
 use Elastica\Query\AbstractQuery;
+use Hooks;
 use MWNamespace;
 
 /**
@@ -97,6 +97,7 @@
         *
         * @param array $rescoreDef
         * @return FunctionScore|null the rescore query
+        * @throws InvalidRescoreProfileException
         */
        private function buildRescoreQuery( array $rescoreDef ) {
                switch( $rescoreDef['type'] ) {
@@ -127,6 +128,7 @@
         *
         * @param array $profile
         * @return array the supported rescore profile.
+        * @throws InvalidRescoreProfileException
         */
        private function getSupportedProfile( array $profile ) {
                if ( !is_array( $profile['supported_namespaces'] ) ) {
@@ -167,6 +169,7 @@
        /**
         * @param string $profileName the profile to load
         * @return array the rescore profile identified by $profileName
+        * @throws InvalidRescoreProfileException
         */
        private function getFallbackProfile( $profileName ) {
                $profile = $this->context->getConfig()->getElement( 
'CirrusSearchRescoreProfiles', $profileName );
@@ -221,8 +224,9 @@
         * Builds a new function score chain.
         *
         * @param SearchContext $context
-        * @param string $chainName the name of the chain (must be a valid
+        * @param string        $chainName the name of the chain (must be a 
valid
         *  chain in wgCirrusSearchRescoreFunctionScoreChains)
+        * @throws InvalidRescoreProfileException
         */
        public function __construct( SearchContext $context, $chainName ) {
                $this->chainName = $chainName;
@@ -242,6 +246,7 @@
        /**
         * @return FunctionScore|null the rescore query or null none of 
functions were
         *  needed.
+        * @throws InvalidRescoreProfileException
         */
        public function buildRescoreQuery() {
                if ( !isset( $this->chain['functions'] ) ) {
@@ -250,6 +255,12 @@
                foreach( $this->chain['functions'] as $func ) {
                        $impl = $this->getImplementation( $func );
                        $impl->append( $this->functionScore );
+               }
+               // Add extensions
+               if ( !empty( $this->chain['add_extensions'] ) ) {
+                       foreach ( $this->context->getExtraScoreBuilders() as 
$extBuilder ) {
+                               $extBuilder->append( $this->functionScore );
+                       }
                }
                if ( !$this->functionScore->isEmptyFunction() ) {
                        return $this->functionScore;
@@ -260,6 +271,7 @@
        /**
         * @param array $func
         * @return FunctionScoreBuilder
+        * @throws InvalidRescoreProfileException
         */
        private function getImplementation( $func ) {
                $weight = isset ( $func['weight'] ) ? $func['weight'] : 1;
@@ -286,10 +298,13 @@
                        return new LogMultFunctionScoreBuilder( $this->context, 
$weight,  $func['params'] );
                case 'geomean':
                        return new GeoMeanFunctionScoreBuilder( $this->context, 
$weight,  $func['params'] );
-               case 'georadius':
-                       return new GeoRadiusFunctionScoreBuilder( 
$this->context, $weight );
                default:
-                       throw new InvalidRescoreProfileException( "Unknown 
function score type {$func['type']}." );
+                       $builder = null;
+                       Hooks::run( 'CirrusSearchScoreBuilder', [ $func, 
$this->context, $weight, &$builder ] );
+                       if ( !$builder ) {
+                               throw new InvalidRescoreProfileException( 
"Unknown function score type {$func['type']}." );
+                       }
+                       return $builder;
                }
        }
 }
@@ -635,8 +650,9 @@
 
        /**
         * @param SearchContext $context
-        * @param float $weight
-        * @param array $profile
+        * @param float         $weight
+        * @param array         $profile
+        * @throws InvalidRescoreProfileException
         */
        public function __construct( SearchContext $context, $weight, $profile 
) {
                parent::__construct( $context, $weight );
@@ -667,6 +683,7 @@
         * @param float $M
         * @param float $N
         * @return float
+        * @throws InvalidRescoreProfileException
         */
        private function findCenterFactor( $M, $N ) {
                // Neutral point is found by resolving
@@ -719,8 +736,9 @@
 
        /**
         * @param SearchContext $context
-        * @param float $weight
-        * @param array $profile
+        * @param float         $weight
+        * @param array         $profile
+        * @throws InvalidRescoreProfileException
         */
        public function __construct( SearchContext $context, $weight, $profile 
) {
                parent::__construct( $context, $weight );
@@ -779,8 +797,9 @@
 
        /**
         * @param SearchContext $context
-        * @param float $weight
-        * @param array $profile
+        * @param float         $weight
+        * @param array         $profile
+        * @throws InvalidRescoreProfileException
         */
        public function __construct( SearchContext $context, $weight, $profile 
) {
                parent::__construct( $context, $weight );
@@ -834,8 +853,9 @@
 
        /**
         * @param SearchContext $context
-        * @param float $weight
-        * @param array $profile
+        * @param float         $weight
+        * @param array         $profile
+        * @throws InvalidRescoreProfileException
         */
        public function __construct( SearchContext $context, $weight, $profile 
) {
                parent::__construct( $context, $weight );
@@ -951,22 +971,6 @@
                }
                $functionScore->addScriptScoreFunction( new 
\Elastica\Script\Script( $exponentialDecayExpression,
                        $parameters, 'expression' ), null, $this->weight );
-       }
-}
-
-/**
- * Builds a boost for documents based on geocoordinates.
- * Reads its params from SearchContext::geoBoost. Initialized
- * by special syntax in user query.
- */
-class GeoRadiusFunctionScoreBuilder extends FunctionScoreBuilder {
-       public function append( FunctionScore $functionScore ) {
-               foreach ( $this->context->getGeoBoosts() as $config ) {
-                       $functionScore->addWeightFunction(
-                               $this->weight * $config['weight'],
-                               GeoFeature::createQuery( $config['coord'], 
$config['radius'] )
-                       );
-               }
        }
 }
 
diff --git a/includes/Search/SearchContext.php 
b/includes/Search/SearchContext.php
index 68cf13d..7cb6e5a 100644
--- a/includes/Search/SearchContext.php
+++ b/includes/Search/SearchContext.php
@@ -69,10 +69,9 @@
        private $rescoreProfile;
 
        /**
-        * @var array[] nested array of arrays. Each child array contains three 
keys:
-        * coord, radius and weight. Used for geographic radius boosting.
+        * @var FunctionScoreBuilder[] Extra scoring builders to use.
         */
-       private $geoBoosts = [];
+       private $extraScoreBuilders = [];
 
        /**
         * @var bool Could this query possibly return results?
@@ -184,6 +183,7 @@
                'near_match' => 10,
                'prefix' => 2,
        ];
+
        /**
         * @param SearchConfig $config
         * @param int[]|null $namespaces
@@ -316,31 +316,10 @@
        }
 
        /**
-        * @param string the rescore profile to use
+        * @param string $rescoreProfile the rescore profile to use
         */
        public function setRescoreProfile( $rescoreProfile ) {
                $this->rescoreProfile = $rescoreProfile;
-       }
-
-       /**
-        * @return array[] nested array of arrays. Each child array contains 
three keys:
-        * coord, radius and weight
-        */
-       public function getGeoBoosts() {
-               return $this->geoBoosts;
-       }
-
-       /**
-        * @param Coord $coord Coordinates to boost near
-        * @param int $radius radius to boost within, in meters
-        * @param float $weight Number to multiply score by when within radius
-        */
-       public function addGeoBoost( Coord $coord, $radius, $weight ) {
-               $this->geoBoosts[] = [
-                       'coord' => $coord,
-                       'radius' => $radius,
-                       'weight' => $weight,
-               ];
        }
 
        /**
@@ -474,7 +453,7 @@
        }
 
        /**
-        * @param AbstractQuery Query that should be used for highlighting if 
different
+        * @param AbstractQuery $query Query that should be used for 
highlighting if different
         *  from the query used for selecting.
         */
        public function setHighlightQuery( AbstractQuery $query ) {
@@ -491,6 +470,7 @@
        }
 
        /**
+        * @param ResultsType $resultsType
         * @return array|null Highlight portion of query to be sent to 
elasticsearch
         */
        public function getHighlight( ResultsType $resultsType ) {
@@ -710,7 +690,8 @@
        }
 
        /**
-        * @param string set the original search term
+        * Set the original search term
+        * @param string $term
         */
        public function setOriginalSearchTerm( $term ) {
                $this->originalSearchTerm = $term;
@@ -722,4 +703,21 @@
        public function escaper() {
                return $this->escaper;
        }
+
+       /**
+        * @return FunctionScoreBuilder[]
+        */
+       public function getExtraScoreBuilders() {
+               return $this->extraScoreBuilders;
+       }
+
+       /**
+        * Add custom scoring function to the context.
+        * The rescore builder will pick it up.
+        * @param FunctionScoreBuilder $rescore
+        */
+       public function addCustomRescoreComponent( FunctionScoreBuilder 
$rescore ) {
+               $this->extraScoreBuilders[] = $rescore;
+       }
+
 }
diff --git a/includes/Searcher.php b/includes/Searcher.php
index 9aafbc8..92fc301 100644
--- a/includes/Searcher.php
+++ b/includes/Searcher.php
@@ -2,12 +2,14 @@
 
 namespace CirrusSearch;
 
+use CirrusSearch\Query\KeywordFeature;
 use CirrusSearch\Search\FullTextResultsType;
 use CirrusSearch\Search\ResultsType;
 use CirrusSearch\Search\RescoreBuilder;
 use CirrusSearch\Search\SearchContext;
 use CirrusSearch\Query\FullTextQueryBuilder;
 use CirrusSearch\Elastica\MultiSearch as MultiSearch;
+use Elastica\Exception\RuntimeException;
 use Language;
 use MediaWiki\Logger\LoggerFactory;
 use MediaWiki\MediaWikiServices;
@@ -17,6 +19,8 @@
 use ApiUsageException;
 use UsageException;
 use User;
+use Hooks;
+
 
 /**
  * Performs searches using Elasticsearch.  Note that each instance of this 
class
@@ -293,45 +297,63 @@
                $builderProfile = $this->config->get( 
'CirrusSearchFullTextQueryBuilderProfile' );
                $builderSettings = $this->config->getElement( 
'CirrusSearchFullTextQueryBuilderProfiles', $builderProfile );
 
+               $features = [
+                       // Handle morelike keyword (greedy). This needs to be 
the
+                       // very first item until combining with other queries
+                       // is worked out.
+                       new Query\MoreLikeFeature( $this->config ),
+                       // Handle title prefix notation (greedy)
+                       new Query\PrefixFeature(),
+                       // Handle prefer-recent keyword
+                       new Query\PreferRecentFeature( $this->config ),
+                       // Handle local keyword
+                       new Query\LocalFeature(),
+                       // Handle insource keyword using regex
+                       new Query\RegexInSourceFeature( $this->config ),
+                       // Handle boost-templates keyword
+                       new Query\BoostTemplatesFeature(),
+                       // Handle hastemplate keyword
+                       new Query\HasTemplateFeature(),
+                       // Handle linksto keyword
+                       new Query\LinksToFeature(),
+                       // Handle incategory keyword
+                       new Query\InCategoryFeature( $this->config ),
+                       // Handle non-regex insource keyword
+                       new Query\SimpleInSourceFeature(),
+                       // Handle intitle keyword
+                       new Query\InTitleFeature(),
+                       // inlanguage keyword
+                       new Query\LanguageFeature(),
+                       // File types
+                       new Query\FileTypeFeature(),
+                       // File numeric characteristics - size, resolution, etc.
+                       new Query\FileNumericFeature(),
+               ];
+
+               $extraFeatures = [];
+               Hooks::run( 'CirrusSearchAddQueryFeatures', [ $this->config, 
&$extraFeatures ] );
+               foreach ( $extraFeatures as $extra ) {
+                       if ( $extra instanceof KeywordFeature ) {
+                               $features[] = $extra;
+                       } else {
+                               LoggerFactory::getInstance( 'CirrusSearch' )
+                                       ->warning( 'Skipped invalid feature of 
class ' . get_class( $extra ) );
+                       }
+               }
+
+               /** @var FullTextQueryBuilder $qb */
                $qb = new $builderSettings['builder_class'](
                        $this->config,
-                       [
-                               // Handle morelike keyword (greedy). This needs 
to be the
-                               // very first item until combining with other 
queries
-                               // is worked out.
-                               new Query\MoreLikeFeature( $this->config ),
-                               // Handle title prefix notation (greedy)
-                               new Query\PrefixFeature(),
-                               // Handle prefer-recent keyword
-                               new Query\PreferRecentFeature( $this->config ),
-                               // Handle local keyword
-                               new Query\LocalFeature(),
-                               // Handle insource keyword using regex
-                               new Query\RegexInSourceFeature( $this->config ),
-                               // Handle neartitle, nearcoord keywords, and 
their boosted alternates
-                               new Query\GeoFeature(),
-                               // Handle boost-templates keyword
-                               new Query\BoostTemplatesFeature(),
-                               // Handle hastemplate keyword
-                               new Query\HasTemplateFeature(),
-                               // Handle linksto keyword
-                               new Query\LinksToFeature(),
-                               // Handle incategory keyword
-                               new Query\InCategoryFeature( $this->config ),
-                               // Handle non-regex insource keyword
-                               new Query\SimpleInSourceFeature(),
-                               // Handle intitle keyword
-                               new Query\InTitleFeature(),
-                               // inlanguage keyword
-                               new Query\LanguageFeature(),
-                               // File types
-                               new Query\FileTypeFeature(),
-                               // File numeric characteristics - size, 
resolution, etc.
-                               new Query\FileNumericFeature(),
-                       ],
+                       $features,
                        $builderSettings['settings']
                );
 
+
+
+               if ( !( $qb instanceof FullTextQueryBuilder ) ) {
+                       throw new RuntimeException( "Bad builder class 
configured: {$builderSettings['builder_class']}" );
+               }
+
                $showSuggestion = $showSuggestion && $this->offset == 0
                        && $this->config->get( 'CirrusSearchEnablePhraseSuggest' );
                $qb->build( $this->searchContext, $term, $showSuggestion );
diff --git a/profiles/RescoreProfiles.config.php b/profiles/RescoreProfiles.config.php
index 4725b67..3ffc62e 100644
--- a/profiles/RescoreProfiles.config.php
+++ b/profiles/RescoreProfiles.config.php
@@ -185,15 +185,8 @@
                        // Scores documents according to their language,
                        // See $wgCirrusSearchLanguageWeight
                        [ 'type' => 'language' ],
-
-                       // Boosts documents in a particular geographic area.
-                       // Triggered by query syntax.
-                       [ 'type' => 'georadius', 'weight' => [
-                               'value' => 2,
-                               'config_override' => 'CirrusSearchPreferGeoRadiusWeight',
-                               'uri_param_override' => 'cirrusPreferGeoRadiusWeight',
-                       ] ],
-               ]
+               ],
+               'add_extensions' => true
        ],
        // Chain with optional functions if classic_allinone_chain
        // or optional_chain is omitted from the rescore profile then some
@@ -204,12 +197,8 @@
                        [ 'type' => 'templates' ],
                        [ 'type' => 'namespaces' ],
                        [ 'type' => 'language' ],
-                       [ 'type' => 'georadius', 'weight' => [
-                               'value' => 2,
-                               'config_override' => 'CirrusSearchPreferGeoRadiusWeight',
-                               'uri_param_override' => 'cirrusPreferGeoRadiusWeight',
-                       ] ],
-               ]
+               ],
+               'add_extensions' => true
        ],
        // Chain with boostlinks only
        'boostlinks_only_chain' => [
diff --git a/tests/unit/Query/GeoFeatureTest.php b/tests/unit/Query/GeoFeatureTest.php
deleted file mode 100644
index 126657c..0000000
--- a/tests/unit/Query/GeoFeatureTest.php
+++ /dev/null
@@ -1,292 +0,0 @@
-<?php
-
-namespace CirrusSearch\Query;
-
-use CirrusSearch\CirrusTestCase;
-use CirrusSearch\SearchConfig;
-use GeoData\Coord;
-use LoadBalancer;
-use IDatabase;
-use MediaWiki\MediaWikiServices;
-use Title;
-
-/**
- * Test GeoFeature functions.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
- * http://www.gnu.org/copyleft/gpl.html
- *
- * @group CirrusSearch
- */
-class GeoFeatureTest extends CirrusTestCase {
-
-       public function parseDistanceProvider() {
-               return [
-                       'unknown units returns null' => [
-                               null,
-                               '100fur',
-                       ],
-                       'gibberish returns null' => [
-                               null,
-                               'gibberish',
-                       ],
-                       'no space allowed between numbers and units' => [
-                               null,
-                               '100 m',
-                       ],
-                       'meters' => [
-                               100,
-                               '100m',
-                       ],
-                       'kilometers' => [
-                               1000,
-                               '1km',
-                       ],
-                       'yards' => [
-                               366,
-                               '400yd',
-                       ],
-                       'one mile rounds down' => [
-                               1609,
-                               '1mi',
-                       ],
-                       'two miles rounds up' => [
-                               '3219',
-                               '2mi',
-                       ],
-                       '1000 feet rounds up' => [
-                               305,
-                               '1000ft',
-                       ],
-                       '3000 feet rounds down' => [
-                               914,
-                               '3000ft',
-                       ],
-                       'small requests are bounded' => [
-                               10,
-                               '1ft',
-                       ],
-                       'allows large inputs' => [
-                               4321000,
-                               '4321km',
-                       ],
-               ];
-       }
-
-       /**
-        * @dataProvider parseDistanceProvider
-        */
-       public function testParseDistance( $expected, $distance ) {
-               if ( class_exists( Coord::class ) ) {
-                       $feature = new GeoFeature();
-                       $this->assertEquals( $expected, $feature->parseDistance( $distance ) );
-               } else {
-                       $this->markTestSkipped( 'GeoData extension must be installed' );
-               }
-       }
-
-       public function parseGeoNearbyProvider() {
-               return [
-                       'random input' => [
-                               [ null, 0 ],
-                               'gibberish'
-                       ],
-                       'random input with comma' => [
-                               [ null, 0 ],
-                               'gibberish,42.42'
-                       ],
-                       'random input with valid radius prefix' => [
-                               [ null, 0 ],
-                               '20km,42.42,invalid',
-                       ],
-                       'valid coordinate, default radius' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 2.3456 ],
-                                       5000,
-                               ],
-                               '1.2345,2.3456',
-                       ],
-                       'valid coordinate, specific radius in meters' => [
-                               [
-                                       [ 'lat' => -5.4321, 'lon' => 42.345 ],
-                                       4321,
-                               ],
-                               '4321m,-5.4321,42.345',
-                       ],
-                       'valid coordinate, specific radius in kilmeters' => [
-                               [
-                                       [ 'lat' => 0, 'lon' => 42.345 ],
-                                       7000,
-                               ],
-                               '7km,0,42.345',
-                       ],
-                       'out of bounds positive latitude' => [
-                               [ null, 0 ],
-                               '90.1,0'
-                       ],
-                       'out of bounds negative latitude' => [
-                               [ null, 0 ],
-                               '-90.1,17',
-                       ],
-                       'out of bounds positive longitude' => [
-                               [ null, 0 ],
-                               '49,180.1',
-                       ],
-                       'out of bounds negative longitude' => [
-                               [ null, 0 ],
-                               '49,-180.001',
-                       ],
-                       'valid coordinate with spaces' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 9.8765 ],
-                                       5000
-                               ],
-                               '1.2345, 9.8765'
-                       ],
-               ];
-       }
-
-       /**
-        * @dataProvider parseGeoNearbyProvider
-        */
-       public function testParseGeoNearby( $expected, $value ) {
-               if ( class_exists( Coord::class ) ) {
-                       $feature = new GeoFeature;
-                       $result = $feature->parseGeoNearby( $value );
-                       if ( $result[0] instanceof Coord ) {
-                               $result[0] = [ 'lat' => $result[0]->lat, 'lon' => $result[0]->lon ];
-                       }
-                       $this->assertEquals( $expected, $result );
-               } else {
-                       $this->markTestSkipped( 'GeoData extension must be installed' );
-               }
-       }
-
-       public function parseGeoNearbyTitleProvider() {
-               return [
-                       'basic page lookup' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 5.4321 ],
-                                       5000,
-                                       7654321,
-                               ],
-                               'San Francisco'
-                       ],
-                       'basic page lookup with radius in meters' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 5.4321 ],
-                                       1234,
-                                       7654321,
-                               ],
-                               '1234m,San Francisco'
-                       ],
-                       'basic page lookup with radius in kilometers' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 5.4321 ],
-                                       2000,
-                                       7654321,
-                               ],
-                               '2km,San Francisco'
-                       ],
-                       'basic page lookup with space between radius and name' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 5.4321 ],
-                                       2000,
-                                       7654321,
-                               ],
-                               '2km, San Francisco'
-                       ],
-                       'page with comma in name' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 5.4321 ],
-                                       5000,
-                                       1234567,
-                               ],
-                               'Washington, D.C.'
-                       ],
-                       'page with comma in name and radius in kilometers' => [
-                               [
-                                       [ 'lat' => 1.2345, 'lon' => 5.4321 ],
-                                       7000,
-                                       1234567,
-                               ],
-                               '7km,Washington, D.C.'
-                       ],
-                       'unknown page lookup' => [
-                               [ null, 0, '' ],
-                               'Unknown Title',
-                       ],
-                       'unknown page lookup with radius' => [
-                               [ null, 0, '' ],
-                               '4km, Unknown Title',
-                       ],
-               ];
-       }
-
-       /**
-        * @dataProvider parseGeoNearbyTitleProvider
-        */
-       public function testParseGeoNearbyTitle( $expected, $value ) {
-               if ( ! class_exists( Coord::class ) ) {
-                       $this->markTestSkipped( 'GeoData extension must be installed' );
-                       return;
-               }
-
-               // Replace database with one that will return our fake coordinates if asked
-               $db = $this->getMock( IDatabase::class );
-               $db->expects( $this->any() )
-                       ->method( 'select' )
-                       ->with( 'geo_tags', $this->anything(), $this->anything(), $this->anything() )
-                       ->will( $this->returnValue( [
-                               (object) [ 'gt_lat' => 1.2345, 'gt_lon' => 5.4321 ],
-                       ] ) );
-               // Tell LinkCache all titles not explicitly added don't exist
-               $db->expects( $this->any() )
-                       ->method( 'selectRow' )
-                       ->with( 'page', $this->anything(), $this->anything(), $this->anything() )
-                       ->will( $this->returnValue( false ) );
-               // Inject mock database into a mock LoadBalancer
-               $lb = $this->getMockBuilder( LoadBalancer::class )
-                       ->disableOriginalConstructor()
-                       ->getMock();
-               $lb->expects( $this->any() )
-                       ->method( 'getConnection' )
-                       ->will( $this->returnValue( $db ) );
-               $this->setService( 'DBLoadBalancer', $lb );
-
-               // Inject fake San Francisco page into LinkCache so it "exists"
-               MediaWikiServices::getInstance()->getLinkCache()
-                       ->addGoodLinkObj( 7654321, Title::newFromText( 'San Francisco' ) );
-               // Inject fake page with comma in it as well
-               MediaWikiServices::getInstance()->getLinkCache()
-                       ->addGoodLinkObj( 1234567, Title::newFromText( 'Washington, D.C.' ) );
-
-               $config = $this->getMock( SearchConfig::class );
-               $config->expects( $this->any() )
-                       ->method( 'makeId' )
-                       ->will( $this->returnCallback( function ( $id ) {
-                               return $id;
-                       } ) );
-
-               // Finally run the test
-               $feature = new GeoFeature;
-               $result = $feature->parseGeoNearbyTitle( $config, $value );
-               if ( $result[0] instanceof Coord ) {
-                       $result[0] = [ 'lat' => $result[0]->lat, 'lon' => $result[0]->lon ];
-               }
-
-               $this->assertEquals( $expected, $result );
-       }
-}
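
Reviewer note: the removed parseDistanceProvider cases pin down the distance parsing that moves to GeoData with this change. As a rough illustration only, the sketch below reproduces those expectations. The function name sketchParseDistance, the conversion table, the integer-only regex, and the 10 m lower bound are assumptions inferred from the test data above, not the actual GeoFeature::parseDistance() implementation.

```php
<?php

/**
 * Hypothetical stand-in for GeoFeature::parseDistance(), inferred
 * solely from the deleted test cases; not the extension's code.
 *
 * @param string $distance Digits immediately followed by a unit, e.g. "100m"
 * @return int|null Radius in metres, or null if the input is not recognised
 */
function sketchParseDistance( $distance ) {
	// Conversion factors to metres (assumed set of accepted units).
	$units = [
		'm' => 1,
		'km' => 1000,
		'yd' => 0.9144,
		'mi' => 1609.344,
		'ft' => 0.3048,
	];

	// No whitespace is allowed between the number and the unit.
	if ( !preg_match( '/^(\d+)([a-z]+)$/', $distance, $m ) ) {
		return null;
	}
	if ( !isset( $units[$m[2]] ) ) {
		return null;
	}

	// Round to whole metres, then apply the assumed 10 m minimum.
	$metres = (int)round( (int)$m[1] * $units[$m[2]] );
	return max( 10, $metres );
}
```

Under these assumptions, '400yd' gives 366, '1ft' is clamped to 10, and '100 m' or '100fur' give null, matching the provider rows above. Decimal inputs are not covered by the tests and are rejected by this sketch.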

-- 
To view, visit https://gerrit.wikimedia.org/r/326037
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id08efd46337a977639ebf3724ee3492512f326ac
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/CirrusSearch
Gerrit-Branch: master
Gerrit-Owner: Smalyshev <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits