[ 
https://issues.apache.org/jira/browse/SOLR-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511346
 ] 

Pieter Berkel commented on SOLR-295:
------------------------------------

Thanks Ryan, I missed the original thread mentioned in SOLR-281, but I completely 
agree with the line of thinking and the proposals (actually, I was thinking the 
same when I made the above patch).  There is little point in duplicating code 
across request handlers (leading to code bloat, as you suggested); refactoring 
common functionality into separate components will ensure consistency in 
the response format across all handlers.

I'll take a look at the patch submitted on SOLR-281 and see what I can do in 
terms of implementing my MLT ideas.  However, until the 'search component' 
framework concept has really been 'solidified', I'm afraid it's going to be 
difficult to extend.

regards,
Pieter

> Implementing MoreLikeThis support in DismaxRequestHandler
> ---------------------------------------------------------
>
>                 Key: SOLR-295
>                 URL: https://issues.apache.org/jira/browse/SOLR-295
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Pieter Berkel
>            Priority: Minor
>         Attachments: MoreLikeThis-DismaxRequestHandler_SOLR-295.patch
>
>
> There's nothing too clever about this initial patch, to be uploaded shortly; I 
> have simply extracted the MLT code from the StandardRequestHandler and 
> inserted it into the DismaxRequestHandler.  However, there are some broader 
> MLT issues that I'd also like to address in the near future:
> 1) (trivial) No "This response format is experimental" warning when MLT is 
> used with StandardRequestHandler (or DismaxRequestHandler).  Not really a big 
> deal but at least makes developers aware of the possibility of future changes.
> 2) (trivial) "org.apache.solr.common.util.MoreLikeThisParams" should perhaps 
> be moved to the more appropriate package "org.apache.solr.common.params".
> 3) (non-trivial) The ability to specify the list of fields that should be 
> returned when MLT is invoked from an external handler (e.g. 
> StandardRequestHandler).  Currently the field list (FL) parameter is 
> inherited from the main query but I can envisage cases where it would be 
> desirable to specify more or fewer return fields in the MLT query than the 
> main query.  One complication is that "mlt.fl" is already used to specify the 
> fields used for similarity.  Perhaps "mlt.fl" is not the best name for this 
> parameter and should be renamed to avoid potential conflict / confusion?
> 4) (fairly-trivial) On a similar note to 3, there is currently no way to 
> specify a "start" value for the rows returned when MLT is invoked from an 
> external handler (e.g. StandardRequestHandler), it is hard-coded to 0 (i.e. 
> the first "mlt.count" documents matched).  While I can see the logic in 
> naming the parameter "mlt.count", it does seem a little inconsistent and 
> perhaps it would be better to rename (or at least alias) it to "mlt.rows" to 
> be consistent with the CommonQueryParameters.  Note that "mlt.start" is 
> fundamentally different to the "mlt.match.offset" parameter as the latter 
> deals with documents *matching* the initial MLT query while the former deals 
> with documents *returned* by the MLT query (hope that makes sense).
> I have created a patch that implemented "mlt.start" (to specify the start 
> doc) and added "mlt.rows" that could be used interchangeably with "mlt.count" 
> (but I would prefer to remove "mlt.count" altogether), but since it involves 
> changing the method definition of MoreLikeThisHelper.getMoreLikeThese(), I 
> wanted to get some opinions before submitting it.
> 5) (non-trivial) Interesting Terms - the ability to return interesting term 
> information using the "mlt.interestingTerms" parameter when MLT is invoked 
> from an external handler.  This is perhaps the most useful feature I am 
> looking to implement, I can see great benefit in being able to provide a list 
> of interesting terms or "keywords" for each document returned in a standard 
> or dismax query.  Currently this is only available from the MLT request handler 
> so perhaps the best approach would be to re-factor the "interestingTerms" 
> code in MoreLikeThisHandler class and put it somewhere in MoreLikeThisHelper 
> so it is available to all handlers?  Again, I would appreciate any comments 
> or suggestions.
> I've also noted the MLT features suggested by Tristan [ 
> http://www.nabble.com/MoreLikeThis-with-DisMax-boost-query---functions-tf4047187.html
>  ] which could quite possibly be rolled together with the above points -- I'm 
> not sure whether it is better to have a single ticket tracking several 
> related issues or create individual tickets for each issue; however, I will be 
> happy to comply with the Solr issue tracking policy on advice from the core 
> developers.
> regards,
> Pieter
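
To make the paging semantics proposed in point 4 concrete, here is a minimal 
sketch, assuming "mlt.rows" takes precedence over "mlt.count" when both are 
given and "mlt.start" defaults to 0.  The MltPaging class, its method names 
and the default of 5 rows are hypothetical and exist only for illustration; 
this is not the attached patch and not the MoreLikeThisHelper API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr: illustrates the proposed parameter semantics only.
final class MltPaging {
    // Assumed default number of similar documents, chosen for illustration.
    static final int DEFAULT_ROWS = 5;

    // "mlt.rows" wins when present; otherwise fall back to the older "mlt.count".
    static int resolveRows(Map<String, String> params) {
        String value = params.get("mlt.rows");
        if (value == null) {
            value = params.get("mlt.count");
        }
        return value == null ? DEFAULT_ROWS : Integer.parseInt(value);
    }

    // "mlt.start" is an offset into the documents *returned* by the MLT query.
    static int resolveStart(Map<String, String> params) {
        String value = params.get("mlt.start");
        return value == null ? 0 : Integer.parseInt(value);
    }

    // Returns the window [start, start + rows) of the similar documents,
    // clamped to the bounds of the list.
    static <T> List<T> page(List<T> similarDocs, Map<String, String> params) {
        int size = similarDocs.size();
        int start = Math.max(0, Math.min(resolveStart(params), size));
        int rows = Math.max(0, resolveRows(params));
        // Use long arithmetic so a very large "mlt.rows" cannot overflow.
        int end = (int) Math.min((long) size, (long) start + rows);
        return new ArrayList<>(similarDocs.subList(start, end));
    }
}
```

With seven similar documents and the parameters mlt.start=2 and mlt.rows=2, 
page() would return the third and fourth documents.  Treating "mlt.count" as 
a fallback keeps existing requests working while the new name is introduced.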

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
