Re: [OT, slightly] Some interesting metrics on Lucene

2007-07-08 Thread Ian Holsman
Grant Ingersoll wrote: http://www.ohloh.net/projects/3564 has some interesting metrics on Lucene (and Solr and Nutch). Most interesting is that they estimate it took 34 person-years to develop, at a cost of approximately $1.8 million (using a salary of $55k). before you get too excited,

Hudson build is back to normal: Lucene-Nightly #145

2007-07-08 Thread hudson
See http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/145/changes

Re: search quality - assessment & improvements

2007-07-08 Thread Chris Hostetter
: Thanks for your comments Chris, and sorry for the delayed my turn for a delayed response ... i figured there was no rush since you were offline for 10 days :) : I didn't try this - passing the computed avg doc length to : SweetSpotSimilarity (SSS) - it would be interesting to try. I wonder : h

[OT, slightly] Some interesting metrics on Lucene

2007-07-08 Thread Grant Ingersoll
http://www.ohloh.net/projects/3564 has some interesting metrics on Lucene (and Solr and Nutch). Most interesting is that they estimate it took 34 person-years to develop, at a cost of approximately $1.8 million (using a salary of $55k). One thing they do not do in their analysis is loo

Re: for a better spellchecker

2007-07-08 Thread Chris Hostetter
: Now, SpellChecker uses the trigram algorithm to find similar words. It : works well for keyboard fumbles, but not well enough for short words : and for languages like French where the same sound can be written : differently. : Spellchecking is a classical computer task, and aspell provides some : nic
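To illustrate the short-word weakness raised in the quoted message, here is a minimal sketch of trigram-overlap matching. It is a generic illustration, not Lucene's SpellChecker implementation; the function names and the choice of Jaccard overlap as the score are assumptions made for this example.

```python
def trigrams(word):
    """Return the set of 3-character substrings of word."""
    return {word[i:i + 3] for i in range(len(word) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two words' trigram sets, from 0.0 to 1.0.

    Hypothetical scoring choice for illustration only.
    """
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        # Words shorter than three characters have no trigrams at all.
        return 0.0
    return len(ta & tb) / len(ta | tb)


# A dropped letter in a long word leaves most trigrams intact.
long_score = trigram_similarity("spellchecker", "spelchecker")

# A single wrong letter in a three-letter word leaves nothing in common.
short_score = trigram_similarity("cat", "cot")
```

Under these assumptions a long word with one typo still shares most of its trigrams with the intended word, while a three-letter word with one wrong letter shares none, which is consistent with the complaint about short words.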

[jira] Created: (LUCENE-953) Snowball has new Stemmers available

2007-07-08 Thread Grant Ingersoll (JIRA)
Snowball has new Stemmers available --- Key: LUCENE-953 URL: https://issues.apache.org/jira/browse/LUCENE-953 Project: Lucene - Java Issue Type: Improvement Components: Analysis Reporter:

[jira] Resolved: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff

2007-07-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-848. Resolution: Fixed This has been committed > Add supported for Wikipedia English as a corpu