Hello Hatem,
At the moment, the list of stop words is defined in DSAnalyzer; see:
protected static final String[] STOP_WORDS =
{
// new stopwords (per MargretB)
"a", "am", "and", "are", "as", "at", "be", "but", "by", "for",
"if", "in", "into", "is", "it", "no", "not", "of", "on", "or",
"the", "to", "was"
...
};
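One way to also drop French stop words would be to extend that list with French entries, then rebuild and re-index. The plain-Java sketch below only illustrates the idea of a merged English/French stop set applied to lowercased tokens. It does not use Lucene, and the class name BilingualStopWords, the FRENCH array, and the removeStopWords helper are made up for illustration. The English entries are only those visible in the excerpt above, and the French entries are just the four words from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class BilingualStopWords {
    // English entries visible in the DSAnalyzer excerpt above (the real list is longer).
    static final String[] ENGLISH = {
        "a", "am", "and", "are", "as", "at", "be", "but", "by", "for",
        "if", "in", "into", "is", "it", "no", "not", "of", "on", "or",
        "the", "to", "was"
    };

    // Hypothetical French additions, limited to the words mentioned in the question.
    static final String[] FRENCH = { "le", "la", "de", "dans" };

    // Merge both lists into one lookup set.
    static Set<String> buildStopSet() {
        Set<String> stopSet = new HashSet<>(Arrays.asList(ENGLISH));
        stopSet.addAll(Arrays.asList(FRENCH));
        return stopSet;
    }

    // Lowercase each token, then keep it only if it is not in the stop set.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopSet) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!stopSet.contains(lower)) {
                kept.add(lower);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("La", "recherche", "dans", "the", "index");
        System.out.println(removeStopWords(tokens, buildStopSet()));
    }
}
```

A real French list would be much longer than these four words; the point is only that both languages end up in the same set that the stop filter consults.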
Hope this helps
Claudia Jürgen
On 09.08.2012 16:09, Hatem Jlassi wrote:
> Hi Emilio,
>
> Thanks for your response. I added this code to the DSAnalyzer.java file and
> rebuilt DSpace:
> import org.apache.lucene.analysis.ASCIIFoldingFilter;
> result = new ASCIIFoldingFilter(result);
>
> Searching with accented characters works now, but how can I remove French
> stop words from the index? Currently, a search for a French stop word such as
> Le, La, De, or Dans returns all records that contain that word. Only the
> English stop words are removed.
>
> Regards,
>
>
> From: emilio lorenzo [mailto:[email protected]]
> Sent: 9 August 2012 03:18
> To: Hatem Jlassi; [email protected]
> Subject: Re: [Dspace-tech] Searching : Diacritics & Indexing
>
> Hi,
>
> The class ISOLatin1AccentFilter has been deprecated by Lucene (although it
> can still be found...) and replaced by the ASCIIFoldingFilter class.
> For English + Latin-language installations, we suggest the following
> org.dspace.search.DSAnalyzer configuration (keep the order; it is relevant for
> the searcher):
>
> import org.apache.lucene.analysis.ASCIIFoldingFilter;
> ..
> ..
> result = new StandardFilter(result);
> result = new LowerCaseFilter(result);
> result = new StopFilter(result, stopSet);
> result = new ASCIIFoldingFilter(result);
> result = new PorterStemFilter(result);
>
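Why the order matters can be seen with a stop word that carries an accent. The plain-Java sketch below is not Lucene code: fold approximates accent folding with java.text.Normalizer (an assumption standing in for ASCIIFoldingFilter), and the class name, the stop set, and both helper methods are hypothetical. It assumes a stop set that stores the accented spelling of a word; if folding runs before stop-word removal, the folded token no longer matches that entry.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FilterOrderDemo {
    // Hypothetical stop set: "le" and "deja" spelled with its accents (written as escapes).
    static final Set<String> STOP_SET =
        new HashSet<>(Arrays.asList("le", "d\u00e9j\u00e0"));

    // Rough stand-in for accent folding: decompose, then strip combining marks.
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    // Order suggested in the thread: lowercase, stop-word removal, then folding.
    static List<String> stopThenFold(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_SET.contains(lower)) {
                out.add(fold(lower));
            }
        }
        return out;
    }

    // Swapped order: folding first, so the accented stop-set entry no longer matches.
    static List<String> foldThenStop(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String folded = fold(token.toLowerCase(Locale.ROOT));
            if (!STOP_SET.contains(folded)) {
                out.add(folded);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Le", "D\u00e9j\u00e0", "vu");
        System.out.println(stopThenFold(tokens)); // prints [vu]
        System.out.println(foldThenStop(tokens)); // prints [deja, vu]
    }
}
```

With folding first, the stop set would have to hold the folded spellings instead of the accented ones.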
> Anyway, org.dspace.search.DSAnalyzer corresponds to the Lucene configuration.
> The SOLR configuration is quite different.
> Good luck.
> Emilio
>
>
> On 08/08/2012 20:14, Hatem Jlassi wrote:
> Hi all,
>
> We are running a bilingual (French/English) instance of the latest version of
> DSpace (1.8.2). We have some problems with searching for words with diacritics:
> the DSpace search doesn't find words with accented characters when the query
> doesn't include the accents.
> We modified
> (\dspace-1.8.2-src-release\dspace-api\src\main\java\org\dspace\search\DSAnalyzer.java)
> and added the following two lines:
> ISOLatin1AccentFilter;
> result = new ISOLatin1AccentFilter(result);
> We then rebuilt and re-indexed DSpace,
> but the problem was not resolved.
>
> If anyone has solved this problem, please help! Thank you.
>
> Regards,
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/