Three things:
1> Did you re-index after you got your stopwords file set up? I'd also
   blow away the index directory before re-indexing.
2> If you _store_ your field, the stopwords will be in your results
   lists, but _not_ in your index. As a secondary check, try going into
   your admin/schema browser link and looking at the field in question.
   Stopwords are by definition frequent, so if any were indexed they
   would be at the top of your list.
3> Check a different way by using the TermsComponent (see:
   http://wiki.apache.org/solr/TermsComponent/). This will also show you
   the _indexed_ as opposed to stored terms.
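
For the TermsComponent check, a rough Python sketch might look like the
following. It is only an illustration under some assumptions: that a
/terms request handler is configured, and that the JSON response lists
each field's terms as a flat [term, count, term, count, ...] array. The
base URL, the field name "text", and the helper names are placeholders,
not taken from your setup.

```python
from urllib.parse import urlencode


def terms_url(base_url, field, limit=50):
    """Build a TermsComponent URL asking for the most frequent indexed terms.

    Assumes a "/terms" handler exists under base_url (hypothetical setup).
    """
    params = {
        "terms": "true",
        "terms.fl": field,       # field whose indexed terms we want
        "terms.limit": limit,    # how many terms to return
        "terms.sort": "count",   # most frequent first
        "wt": "json",
    }
    return base_url.rstrip("/") + "/terms?" + urlencode(params)


def pair_up(flat):
    """Turn [term, count, term, count, ...] into [(term, count), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


def stopwords_in_index(response, field, stopwords):
    """Return the (term, count) pairs from the response that are stopwords.

    Assumes response["terms"][field] is the flat term/count list.
    An empty result means none of the returned terms is a stopword.
    """
    wanted = {w.lower() for w in stopwords}
    return [(t, c) for t, c in pair_up(response["terms"][field])
            if t.lower() in wanted]
```

Fetch the URL from terms_url(...) with whatever HTTP client you like,
parse the JSON, and pass it to stopwords_in_index along with the words
from your stopwords file. If the list comes back non-empty, the stopwords
really are in the index and the analysis chain wasn't applied at index
time.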

Best
Erick

On Mon, Jul 16, 2012 at 6:40 AM, Giovanni Gherdovich
<g.gherdov...@gmail.com> wrote:
> Hi all, thank you for your replies.
>
> Lance:
>> Look at the index with the Schema Browser in the Solr UI. This pulls
>> the terms for each field.
>
> I did it, and it was the first alarm I got.
> After indexing, I went to the schema browser hoping
> not to see any stopwords in the top terms, but...
> they were all there.
>
> Michael:
>> Hi Giovanni,
>>
>> you have entered the stopwords into stopword.txt file, right? But in the
>> definition of the field type you are referencing stopwords_FR.txt..
>
> Good catch Michael, but that's not the problem.
>
> In my message I referred to "stopwords.txt", but actually my
> stopwords file is named stopwords_FR.txt, consistent with
> what I put in my schema.xml
>
> By the way, your answers make me think that yes,
> I have a problem: stopwords should not appear in the index.
>
> what a weird situation:
>
> * querying with SOLR for a stopword (say "and") gives me zero results
>   (so, somewhere in the indexing / searching pipeline my stopwords
>   file *is* taken into account)
> * checking the index files with LuCLI for the same stopword gives me
>   tons of hits.
>
> cheers,
> GGhh
