jenkins-bot has submitted this change and it was merged.

Change subject: Improve highlighting for phrase_prefix queries
......................................................................


Improve highlighting for phrase_prefix queries

This addresses https://phabricator.wikimedia.org/T93014 to
enable highlighting in the title, etc., as well as for compound
searches including things other than phrase_prefix queries.

This relies on a sorta hacky query_string query for the highlighter,
because when we use a match query only the title gets highlighted.
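
As a rough illustration of the approach described above, the sketch below
builds the array form of a highlight_query that ORs an optional existing
query with one query_string query per quoted prefix phrase. It uses plain
PHP arrays rather than Elastica; buildHighlightQuery() and its parameters
are hypothetical names for this example and are not part of CirrusSearch.
The array layout is assumed to match what Elastica serializes for a bool
query of query_string clauses.

```php
<?php
// Hypothetical helper, not part of CirrusSearch: returns the array form of a
// bool query whose "should" clauses are the existing highlight query (if any)
// plus one query_string query per phrase prefix.
function buildHighlightQuery( $existing, array $phrasePrefixes ) {
	$should = array();
	if ( $existing !== null ) {
		$should[] = $existing;
	}
	foreach ( $phrasePrefixes as $phrase ) {
		// Mirrors the patch: the phrase plus a trailing wildcard, restricted
		// to the all.plain field.
		$should[] = array(
			'query_string' => array(
				'query' => $phrase . '*',
				'fields' => array( 'all.plain' ),
			),
		);
	}
	return array( 'bool' => array( 'should' => $should ) );
}

// Example: the search "functional p*" with no other highlight query.
$highlightQuery = buildHighlightQuery( null, array( 'functional p' ) );
echo json_encode( $highlightQuery ), "\n";
```

Because every clause is a "should", text matching any one of them can be
highlighted, which is why the prefix phrase can be combined with other
query types.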

Change-Id: I80cbd3c7daacad8680f2a32ab91fac12d915a7ed
---
M includes/Searcher.php
M tests/browser/features/phrase_prefix.feature
M tests/browser/features/support/hooks.rb
3 files changed, 27 insertions(+), 2 deletions(-)

Approvals:
  Manybubbles: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/includes/Searcher.php b/includes/Searcher.php
index 2c0ba8a..7db3da6 100644
--- a/includes/Searcher.php
+++ b/includes/Searcher.php
@@ -192,6 +192,12 @@
        private $nonTextQueries = array();
 
        /**
+        * @var array queries that don't use Elastic's "query string" query, for more
+        * advanced highlighting (e.g. match_phrase_prefix for regular quoted strings).
+        */
+       private $nonTextHighlightQueries = array();
+
+       /**
         * Constructor
         * @param int $offset Offset the results by this much
         * @param int $limit Limit the results to this many
@@ -579,7 +585,13 @@
                                        $phraseMatch->setFieldQuery( "all.plain", $matches[1] );
                                        $phraseMatch->setFieldType( "all.plain", "phrase_prefix" );
                                        $this->nonTextQueries[] = $phraseMatch;
-                                       return array( );
+
+                                       $phraseHighlightMatch = new Elastica\Query\QueryString( );
+                                       $phraseHighlightMatch->setQuery( $matches[1] . '*' );
+                                       $phraseHighlightMatch->setFields( array( 'all.plain' ) );
+                                       $this->nonTextHighlightQueries[] = $phraseHighlightMatch;
+
+                                       return array();
                                }
 
                                if ( !isset( $matches[ 'fuzzy' ] ) ) {
@@ -960,6 +972,18 @@
                                        return $field[ 'type' ] !== 'plain';
                                });
                        }
+                       if ( sizeof( $this->nonTextHighlightQueries ) > 0 ) {
+                               // We have some phrase_prefix queries, so let's include them in the
+                               // generated highlight_query.
+                               $bool = new \Elastica\Query\Bool();
+                               if ( $this->highlightQuery ) {
+                                       $bool->addShould( $this->highlightQuery );
+                               }
+                               foreach ( $this->nonTextHighlightQueries as $nonTextHighlightQuery ) {
+                                       $bool->addShould( $nonTextHighlightQuery );
+                               }
+                               $this->highlightQuery = $bool;
+                       }
                        if ( $this->highlightQuery ) {
                                $highlight[ 'highlight_query' ] = $this->highlightQuery->toArray();
                        }
diff --git a/tests/browser/features/phrase_prefix.feature b/tests/browser/features/phrase_prefix.feature
index 3257d11..ae73606 100644
--- a/tests/browser/features/phrase_prefix.feature
+++ b/tests/browser/features/phrase_prefix.feature
@@ -6,3 +6,4 @@
   Scenario: Simple quoted prefix phrases get results
     When I search for "functional p*"
     Then Functional programming is the first search result
+      And *Functional* *programming* is referential transparency. is the highlighted text of the first search result
diff --git a/tests/browser/features/support/hooks.rb b/tests/browser/features/support/hooks.rb
index 61277e5..41115a9 100644
--- a/tests/browser/features/support/hooks.rb
+++ b/tests/browser/features/support/hooks.rb
@@ -14,7 +14,7 @@
       And a page named Two Words exists with contents ffnonesenseword catapult {{Template_Test}} anotherword [[Category:TwoWords]] [[Category:Categorywith Twowords]] [[Category:Categorywith " Quote]]
       And a page named AlphaBeta exists with contents [[Category:Alpha]] [[Category:Beta]]
       And a page named IHaveATwoWordCategory exists with contents [[Category:CategoryWith ASpace]]
-      And a page named Functional programming exists
+      And a page named Functional programming exists with contents Functional programming is referential transparency.
       And a page named वाङ्मय exists
       And a page named वाङ्‍मय exists
       And a page named वाङ‍्मय exists

-- 
To view, visit https://gerrit.wikimedia.org/r/201350
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I80cbd3c7daacad8680f2a32ab91fac12d915a7ed
Gerrit-PatchSet: 3
Gerrit-Project: mediawiki/extensions/CirrusSearch
Gerrit-Branch: master
Gerrit-Owner: Jdouglas <[email protected]>
Gerrit-Reviewer: Chad <[email protected]>
Gerrit-Reviewer: Jdouglas <[email protected]>
Gerrit-Reviewer: Manybubbles <[email protected]>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits