[jira] Commented: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff

2007-06-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509349 ] Grant Ingersoll commented on LUCENE-848: Committed. > Add supported for Wikipedia English as a corpus in the

[jira] Updated: (LUCENE-868) Making Term Vectors more accessible

2007-06-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-868: --- Attachment: LUCENE-868-v1.patch First pass at a patch on providing a callback mechanism for t

[jira] Commented: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff

2007-06-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509347 ] Grant Ingersoll commented on LUCENE-848: OK, I reran it and it went fine. Not sure what happened, but maybe

[EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed

2007-06-30 Thread Jason van Zyl
To whom it may engage... This is an automated request, but not an unsolicited one. For more information please visit http://gump.apache.org/nagged.html, and/or contact the folk at [EMAIL PROTECTED] Project lucene-java has an issue affecting its community integration. This issue affects

[EMAIL PROTECTED]: Project lucene-java (in module lucene-java) failed

2007-06-30 Thread Jason van Zyl
To whom it may engage... This is an automated request, but not an unsolicited one. For more information please visit http://gump.apache.org/nagged.html, and/or contact the folk at [EMAIL PROTECTED] Project lucene-java has an issue affecting its community integration. This issue affects

Re: search quality - assessment & improvements

2007-06-30 Thread Sean Timm
Is this the paper that you are referring to? A. Chowdhury, D. Grossman, O. Frieder, C. McCabe, "Document Normalization Revisited", ACM-SIGIR, August 2002. http://ir.iit.edu/~abdur/publications/p381-chowdhury.pdf -Sean Doron Cohen wrote on 6/30/2007, 4:56 AM: > In particular for TREC > data,

Re: [jira] Created: (LUCENE-945) contrib/benchmark tests fail find data dirs

2007-06-30 Thread Doron Cohen
Grant Ingersoll <[EMAIL PROTECTED]> wrote on 30/06/2007 05:20:34: > Does this imply it is going to download the test collection for > people when they don't have it when running tests? I don't know if > that is something people are going to want to happen. Yes it does.. and so would auto-build-b

[jira] Updated: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff

2007-06-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-848: --- Attachment: LUCENE-848-build.patch Here's a patch to just the build.xml that downloads from p

[jira] Commented: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff

2007-06-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509289 ] Michael McCandless commented on LUCENE-848: --- Just to add another datapoint: I've been using the wikipedia s

[jira] Commented: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff

2007-06-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509288 ] Michael McCandless commented on LUCENE-848: --- > I think Mike mentioned not doing the one file per article. I

Re: [jira] Created: (LUCENE-945) contrib/benchmark tests fail find data dirs

2007-06-30 Thread Grant Ingersoll
Does this imply it is going to download the test collection for people when they don't have it when running tests? I don't know if that is something people are going to want to happen. -Grant On Jun 27, 2007, at 1:26 AM, Doron Cohen (JIRA) wrote: contrib/benchmark tests fail find data dir

Re: search quality - assessment & improvements

2007-06-30 Thread Doron Cohen
Doug Cutting wrote: > We should be careful not to tune things too much for any one application > and/or dataset. Tools to perform evaluation would clearly be valuable. > But changes that improve Lucene's results on TREC data may or may not > be of general utility. The best way to tune an appli