I tried to dig out everything from my Lucene email folder that looked interesting and useful. I tried to find list archive URLs for all items, so that people can easily see previous discussions or get to contributed code instead of starting the brainstorming process from scratch. Not all items have references yet, but that can be improved later. As we implement things in Lucene we can remove items from this list, and as we get feature requests that seem popular and useful we can append them to it.
I just got sick of keeping all these emails, flagging them, and so on, and had to put this list of TO-DO items somewhere. I think others may find it useful, too.

Otis

--- [EMAIL PROTECTED] wrote:
> otis        02/05/27 16:56:54
>
>   Added:       .        TODO.txt
>   Log:
>   - Lucene TO-DO items.
>
>   Revision  Changes    Path
>   1.1                  jakarta-lucene/TODO.txt
>
>   Index: TODO.txt
>   ===================================================================
>   $Revision: 1.1 $
>
>   LUCENE TO-DO ITEMS
>
>   - Term Vector support
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=273
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=272
>
>   - Support for Search Term Highlighting
>     c.f.
>     http://www.geocrawler.org/archives/3/2624/2001/9/50/6553088/
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=115271
>     http://www.iq-computing.de/index.asp?menu=projekte-lucene-highlight
>     http://nagoya.apache.org/eyebrowse/BrowseList?[EMAIL PROTECTED]&by=thread&from=56403
>
>   - Better support for hits sorted by things other than score.
>     An easy, efficient case is to support results sorted by the order
>     documents were added to the index. A little harder and less
>     efficient is support for results sorted by an arbitrary field.
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=114756
>     http://www.mail-archive.com/[email protected]/msg00228.html
>
>   - Add ability to "boost" individual documents/fields.
>     When a document is indexed, a numeric "boost" value could be
>     specified for the whole document, and/or for individual fields.
>     This value would be multiplied into scores for hits on this
>     document. This would facilitate the implementation of things like
>     Google's PageRank.
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=114749
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=114757
>
>   - Add to FSDirectory the ability to specify where lock files live
>     and to disable the use of lock files altogether (for read-only
>     media).
>     c.f.
>     http://nagoya.apache.org/eyebrowse/BrowseList?[EMAIL PROTECTED]&by=thread&from=57011
>
>   - Add some requested methods:
>       String[] Document.getValues(String fieldName);
>       String[] IndexReader.getIndexedFields();
>       void Token.setPositionIncrement(int);
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=330010
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=330009
>
>   - Péter Halácsy's changes to the QueryParser that make it possible
>     to programmatically specify a default operator (OR or AND).
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=115677
>
>   - The recently submitted code that allows for queries such as
>     "Microsoft suc*" to match "Microsoft success" and "Microsoft
>     sucks".
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=333275
>
>   - Make package protected abstract methods of
>     org.apache.lucene.search.Searcher public (I'd like to be able to
>     make subclasses of Searcher, IndexWriter, IndexReader).
>     c.f.
>     http://www.mail-archive.com/cgi-bin/htsearch?method=and&format=short&config=lucene-dev_jakarta_apache_org&restrict=&exclude=&words=IndexAccessControl
>
>   - Add lastModified() method to Directory, FSDirectory and
>     RAMDirectory, so it could be cached in IndexWriter/Searcher
>     manager.
>
>   - Support for adding more than 1 term to the same position.
>     N.B. I think the Finnish lady already implemented this. It
>     required some pieces of Lucene to be modified. (OG).
>
>   - The ability to retrieve the number of occurrences not only for a
>     term but also for a Phrase.
>     c.f.
>     http://www.mail-archive.com/[email protected]/msg00101.html
>
>   - Alex Murzaku contributed some code for dealing with Russian.
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=115631
>
>   - A lady from Finland submitted code for handling Finnish.
>
>   - Dutch stemmer, analyzer, etc.
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=145
>
>   - French stemmer, analyzer, etc.
>     c.f.
>     http://nagoya.apache.org/eyebrowse/BrowseList?[EMAIL PROTECTED]&by=thread&from=56256
>
>   - Che Dong's CJKTokenizer for Chinese, Japanese, and Korean.
>     c.f.
>     http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=330905
>
>   - Selecting a language-specific analyzer according to a locale.
>     Currently we have to rewrite parts of the Lucene code in order to
>     use another analyzer. It would be useful to select an analyzer
>     without touching the code.
>
>   - Adding "-encoding" option and encoding-sensitive methods to
>     tools.
>     Current tools need minor changes in a Japanese (and other
>     language) environment: adding an "-encode" option and argument,
>     using Reader/Writer classes instead of InputStream/OutputStream
>     classes, etc.
>
>   $Revision: 1.1 $

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
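
To make the "boost" item in the list a bit more concrete, here is a rough sketch of what "multiplied into scores" could look like. This is only an illustration: BoostSketch, setFieldBoost and boostedScore are hypothetical names made up for this note, not existing Lucene classes or methods, and it assumes a field with no explicit boost behaves as if its boost were 1.0.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "boost" TO-DO item; none of these names are Lucene API.
final class BoostSketch {
    // Per-field boosts given at index time; fields without an entry default to 1.0 (assumption).
    private final Map<String, Float> fieldBoosts = new LinkedHashMap<>();
    // Boost given at index time for the whole document.
    private final float documentBoost;

    BoostSketch(float documentBoost) {
        this.documentBoost = documentBoost;
    }

    // Records a boost for one field and returns this object so calls can be chained.
    BoostSketch setFieldBoost(String field, float boost) {
        fieldBoosts.put(field, boost);
        return this;
    }

    // Multiplies the raw score of a hit on the given field by the document and field boosts.
    float boostedScore(String field, float rawScore) {
        float fieldBoost = fieldBoosts.getOrDefault(field, 1.0f);
        return rawScore * documentBoost * fieldBoost;
    }

    public static void main(String[] args) {
        BoostSketch doc = new BoostSketch(2.0f).setFieldBoost("title", 1.5f);
        System.out.println(doc.boostedScore("title", 0.5f)); // 0.5 * 2.0 * 1.5
        System.out.println(doc.boostedScore("body", 0.5f));  // 0.5 * 2.0 * 1.0
    }
}
```

Where the boost would actually be stored in the index, and how it would combine with the rest of the scoring formula, is exactly what the threads referenced for that item discuss; the sketch only shows the multiplication.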
