[ 
https://issues.apache.org/jira/browse/LUCENE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058187#comment-13058187
 ] 

Chris Male edited comment on LUCENE-3271 at 7/1/11 2:39 AM:
------------------------------------------------------------

{quote}
bq. similar.* -> suggest module

Seems a little funky? I guess if we had a query-expansion module, I would think 
it belonged there.
{quote}

MoreLikeThis makes suggestions :D But okay.  Do you have other thoughts for 
what could go into a query-expansion module?  If so, then I'll go with it.  I 
just know that MLT doesn't belong with the queries anymore.

{quote}
{quote}
FieldCacheRewriteMethod -> This doesn't belong in this contrib or the queries 
module. I think we should push it to contrib/misc for the time being. It seems 
to have quite a few constraints on when it's useful. If indeed 
CONSTANT_SCORE_AUTO rewrite is better, then I don't see a purpose for it.
{quote}

My vote would actually be to move this to src/test!  Yeah there are some 
scenarios where this thing could be faster, but really I thought it was just a 
good way to add seek to the doctermsindex termsenum. I do think it and its test 
would be a nice addition to src/test, if someone wants to use it they can always 
snag it from there... it's that expert.
{quote}

src/test it is.

{quote}
In my opinion, as a rewrite method (I think it would require 2, one for the 
variant that ignores TF), we could get better performance out of this with 
cleaner code... in other words you would just use ordinary FuzzyQuery and set 
this rewrite method for its scoring heuristic, or a BQ of FuzzyQueries if you 
are doing the expansion thing
{quote}

So what are you suggesting? We could sandbox it for the time being (see my 
comments about sandbox below).

{quote}
Finally, I wanted to say that it's my opinion that we shouldn't put garbage in 
modules. Modules should be treated like core I think.... yet at the same time I 
totally support efforts to clean up contrib, either removing sandy stuff or 
refactoring it where it belongs in a module.
{quote}

+1 to all this.  I'm going to do a code cleanup on each of the classes that goes 
into the module. Test coverage will be looked into as well. At this stage I 
don't think any of the classes I've suggested moving would be deemed garbage.

{quote}
One option could be to create a sandbox directory either under lucene (it contains 
src/java and src/test but is totally an unorganized sandbox), or itself as a 
contrib temporarily (contrib/sandbox) and take a look at contrib and move stuff 
that's good into modules, but toss all the 'odd things' into this sandbox.
{quote}

I actually really like the idea of a sandbox.  I think for simplicity, it's best 
to make it a contrib.  That way we can easily get it up and running.  It also 
won't 'stain' anything that isn't already stained.

As part of this work, I'll push the SlowCollated* stuff to the sandbox, along 
with FuzzyLikeThis.

> Move 'good' contrib/queries classes to Queries module
> -----------------------------------------------------
>
>                 Key: LUCENE-3271
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3271
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Chris Male
>
> With the Queries module now filled with the FunctionQuery stuff, we should 
> look at closing down contrib/queries.  While not a huge contrib, it contains 
> a number of pretty useful classes and some that should go elsewhere.
> Here's my proposed plan:
> - similar.* -> suggest module
> - regex.* -> queries module
> - BooleanFilter -> queries module under .filters package
> - BoostingQuery -> queries module
> - ChainedFilter -> queries module under .filters package
> - DuplicateFilter -> queries module under .filters package
> - FieldCacheRewriteMethod -> This doesn't belong in this contrib or the 
> queries module.  I think we should push it to contrib/misc for the time 
> being.  It seems to have quite a few constraints on when it's useful.  If 
> indeed CONSTANT_SCORE_AUTO rewrite is better, then I don't see a purpose for 
> it.
> - FilterClause -> class inside BooleanFilter
> - FuzzyLikeThisQuery -> suggest module. This class seems a mess with its 
> Similarity hardcoded.  With all that said, it does seem to do what it claims 
> and with some cleanup, it could be good.
> - TermsFilter -> queries module under .filters package
> - SlowCollated* -> They can stay in the module till we have a better place to 
> nuke them.
> One of the implications of the above moves is that the xml-query-parser, 
> which supports many of the queries, will need to have a dependency on the 
> queries module.  But that seems unavoidable at this stage.
