[ 
https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806833#action_12806833
 ] 

Robert Muir commented on SOLR-1670:
-----------------------------------

Steven, I don't have a problem with your patch (I do not wish to be in the way 
of anyone trying to work on SynonymFilter).

But I want to explain some of where I was coming from.

The main reason I got myself into this mess was to try to add WordNet support 
to Solr. However, this is currently not possible without duplicating a lot of 
code.
We need to be really careful about allowing any order; it does matter in some 
situations.
For example, Lucene's synonym filter (with WordNet support) has an option 
to limit the number of expansions (so it's like a top-N synonym expansion).
Solr doesn't currently have this, so it's N/A for now, but it is an example of 
where the order suddenly becomes important.
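
To make the ordering point concrete, here is a minimal sketch of a top-N 
expansion. It is not Lucene's or Solr's code: the class, the expand() helper 
and the sample words are all hypothetical, and plain strings stand in for 
tokens. It only shows that once expansions are capped, the stored order of the 
synonyms decides which ones survive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TopNExpansionSketch {
    // Hypothetical helper: keep the original term plus at most maxExpansions
    // synonyms, taken in the order in which they are stored.
    static List<String> expand(String term, List<String> synonyms, int maxExpansions) {
        List<String> out = new ArrayList<>();
        out.add(term);
        int n = Math.min(maxExpansions, synonyms.size());
        out.addAll(synonyms.subList(0, n));
        return out;
    }

    public static void main(String[] args) {
        // Same synonym set, two different storage orders (made-up data).
        List<String> orderA = Arrays.asList("quick", "rapid", "speedy");
        List<String> orderB = Arrays.asList("speedy", "rapid", "quick");
        System.out.println(expand("fast", orderA, 2)); // [fast, quick, rapid]
        System.out.println(expand("fast", orderB, 2)); // [fast, speedy, rapid]
    }
}
```

With no cap both orders produce the same set of terms, which is why order is 
harmless today but would not be once a limit exists.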

Only slightly related: we added some improvements to this assertion in Lucene 
recently and found a lot of bugs, with better checking for clearAttribute() and 
end(). At some point I would like to port these test improvements over to Solr, too. 
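
For anyone following along, the kind of flaw described in the issue can be 
sketched like this. This is not the actual assertTokEqual or 
assertTokenStreamContents code: the class and both method names are 
hypothetical, and strings stand in for tokens. The lenient check stops at the 
shorter list, so extra trailing tokens go unnoticed; the strict check also 
compares lengths.

```java
import java.util.Arrays;
import java.util.List;

class TokenListCompareSketch {
    // Lenient: only compares positions that exist in both lists,
    // so it silently ignores extra tokens in the longer list.
    static boolean lenientEquals(List<String> actual, List<String> expected) {
        int n = Math.min(actual.size(), expected.size());
        for (int i = 0; i < n; i++) {
            if (!actual.get(i).equals(expected.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Strict: additionally requires both lists to have the same length.
    static boolean strictEquals(List<String> actual, List<String> expected) {
        return actual.size() == expected.size() && lenientEquals(actual, expected);
    }

    public static void main(String[] args) {
        // Mirrors the 'repeats' case: three tokens produced, one expected.
        List<String> actual = Arrays.asList("ab", "ab", "ab");
        List<String> expected = Arrays.asList("ab");
        System.out.println(lenientEquals(actual, expected)); // true
        System.out.println(strictEquals(actual, expected));  // false
    }
}
```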


> synonymfilter/map repeat bug
> ----------------------------
>
>                 Key: SOLR-1670
>                 URL: https://issues.apache.org/jira/browse/SOLR-1670
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Yonik Seeley
>         Attachments: SOLR-1670.patch, SOLR-1670.patch, SOLR-1670_test.patch
>
>
> As part of converting tests for SOLR-1657, I ran into a problem with 
> synonymfilter.
> The test for 'repeats' has a flaw: it uses this assertTokEqual construct, 
> which does not really validate that two lists of tokens are equal; it just 
> stops at the shorter one.
> {code}
>     // repeats
>     map.add(strings("a b"), tokens("ab"), orig, merge);
>     map.add(strings("a b"), tokens("ab"), orig, merge);
>     assertTokEqual(getTokList(map,"a b",false), tokens("ab"));
>     /* in reality the result from getTokList is ab ab ab!!!!! */
> {code}
> When converted to assertTokenStreamContents, this problem surfaced. Attached 
> is an additional assertion for the existing testcase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
