Re: Comparable ScoreDoc

2007-12-07 Thread Chris Hostetter
: In general I would agree that people may want different implementations for : compare(), but I hardly see that's the case for ScoreDoc. After all, you can : either compare it by score or by doc (at least now). I believe that since : most people use the TopDocsHitCollector, they prefer the compar

Re: Comparable ScoreDoc

2007-12-07 Thread Shai Erera
In general I would agree that people may want different implementations for compare(), but I hardly see that's the case for ScoreDoc. After all, you can either compare it by score or by doc (at least now). I believe that since most people use the TopDocsHitCollector, they prefer the compare-by-scor

Re: O/S Search Comparisons

2007-12-07 Thread Mark Miller
Did it crash on the 10 GB? I thought it said that it just took way too long (7 times the best or something). Frankly, either case is suspect. Last summer I indexed about 5 million docs with a total size at the *very* least of 10 GB on my 3-year-old desktop. It didn't take much more than 8 hours

Re: O/S Search Comparisons

2007-12-07 Thread Grant Ingersoll
All true and good points. Lucene held up quite nicely in the search aspect (at least performance-wise), and I generally don't think making these kinds of comparisons is all that useful (we call it apples and oranges in English :-) ). What I am trying to get at is if this paper was just about Luc

Re: SpellChecker in 2.2.0

2007-12-07 Thread Otis Gospodnetic
Any unused searcher that doesn't get closed will simply be garbage collected when the GC gets to it. Are you seeing problems with the spellchecker? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: sujit

RE: O/S Search Comparisons

2007-12-07 Thread Samir Abdou
There is an expression in French that says "comparer des pommes et des poires" which literally means "to compare apples and pears". That's what this paper is about. From my point of view, such a comparison would be interesting only if a cross analysis of different criteria (for example, retrieval

[jira] Created: (LUCENE-1084) increase default maxFieldLength?

2007-12-07 Thread Daniel Naber (JIRA)
increase default maxFieldLength? Key: LUCENE-1084 URL: https://issues.apache.org/jira/browse/LUCENE-1084 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 2.2

Re: O/S Search Comparisons

2007-12-07 Thread Mark Miller
Yes, and even if they did not use the stock defaults, I would bet there would be complaints about what was done wrong at every turn. This seems like a very difficult thing to do. How long does it take to fully learn how to correctly utilize each search engine for the task at hand? I am sure lon

Re: O/S Search Comparisons

2007-12-07 Thread Mike Klaas
There is a good chance that they were using stock indexing defaults, based on: Lucene: " In the present work, the simple applications bundled with the library were used to index the collection. " On 7-Dec-07, at 10:27 AM, Grant Ingersoll wrote: Yeah, I wasn't too excited over it and I certain

[jira] Commented: (LUCENE-1083) JDiff report of changes between different versions of Lucene

2007-12-07 Thread Matt Doar (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549526 ] Matt Doar commented on LUCENE-1083: --- As an aside, Maven repositories in general could usefully be enhanced to reco

Re: O/S Search Comparisons

2007-12-07 Thread Grant Ingersoll
Yeah, I wasn't too excited over it and I certainly didn't lose any sleep over it, but there are some interesting things of note in there concerning Lucene, including the claim that it fell over on indexing WT10g docs (page 40) and I am always looking for ways to improve things. Overall, I

[jira] Commented: (LUCENE-1083) JDiff report of changes between different versions of Lucene

2007-12-07 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549500 ] Doug Cutting commented on LUCENE-1083: -- The "prior release" is a new concept that needs to be added to the buil

[jira] Commented: (LUCENE-1083) JDiff report of changes between different versions of Lucene

2007-12-07 Thread Matt Doar (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549475 ] Matt Doar commented on LUCENE-1083: --- Grant, I was imagining more that the release process for Lucene could be cha

Re: O/S Search Comparisons

2007-12-07 Thread robert engels
I wouldn't get too excited over this. Once again, it does not seem the evaluator understands the nature of GC based systems, and the memory statistics are quite out of whack. But it is hard to tell because there is no data on how memory consumption was actually measured. A far better way

[jira] Commented: (LUCENE-1083) JDiff report of changes between different versions of Lucene

2007-12-07 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549429 ] Grant Ingersoll commented on LUCENE-1083: - Thanks, Matt. I assume the antjdiff.jar needs to be included som

O/S Search Comparisons

2007-12-07 Thread Grant Ingersoll
Was wondering if people have seen http://wrg.upf.edu/WRG/dctos/Middleton-Baeza.pdf Has some interesting comparisons. Obviously, the comparison of Lucene indexing is done w/ 1.9 so it probably needs to be done again. Just wondering if people see any opportunities to improve Lucene from it

[jira] Resolved: (LUCENE-1077) New Analysis Contributions

2007-12-07 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-1077. - Resolution: Fixed Lucene Fields: (was: [New]) Committed > New Analysis Contri

[jira] Resolved: (LUCENE-1082) IndexReader.lastModified - throws NPE

2007-12-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1082. Resolution: Fixed Fix Version/s: 2.3 I just committed this. Thanks Alan!