Re: text format and scoring

2002-08-02 Thread Joshua O'Madadhain
On Sat, 3 Aug 2002, petite_abeille wrote: > I was wondering what would be a good way to incorporate text format > information in Lucene word/document scoring. For example, when turning > HTML into plain text for indexing purposes, a lot of potentially useful > information is lost: e.g. tags like

text format and scoring

2002-08-02 Thread petite_abeille
Hello, I was wondering what would be a good way to incorporate text format information in Lucene word/document scoring. For example, when turning HTML into plain text for indexing purposes, a lot of potentially useful information is lost: e.g. tags like , and so on could be understood as conve

Re: Full List of Stop Words for Standard Analyzer.

2002-08-02 Thread Doug Cutting
Ian Lea wrote: > In org/apache/lucene/analysis/standard/StandardAnalyzer.java. The source code for the current release is also on the website. In particular, this file is available as: http://jakarta.apache.org/lucene/src/java/org/apache/lucene/analysis/standard/StandardAnalyzer.java Doug

Re: Full List of Stop Words for Standard Analyzer.

2002-08-02 Thread Ian Lea
In org/apache/lucene/analysis/standard/StandardAnalyzer.java. -- Ian. [EMAIL PROTECTED] > [EMAIL PROTECTED] (Suneetha Rao) wrote > > Hi, > I would like to include in my documentation all the stop words. > Can somebody tell me where to find the list for the Standard Analyzer?