Re: don't miss this

2007-01-12 Thread karl wettin
On 13 Jan 2007 at 06:02, Erik Hatcher wrote: I doubt Stefano minds if I pass this on here as food for thought... From: Stefano Mazzocchi <[EMAIL PROTECTED]> http://search.mpi-inf.mpg.de/ and their paper http://www-db.cs.wisc.edu/cidr/cidr2007/papers/P09.pdf I think it would be pretty cool t

Fwd: don't miss this

2007-01-12 Thread Erik Hatcher
I doubt Stefano minds if I pass this on here as food for thought... Begin forwarded message: From: Stefano Mazzocchi <[EMAIL PROTECTED]> Date: January 9, 2007 12:44:51 PM EST Subject: don't miss this http://search.mpi-inf.mpg.de/ and their paper http://www-db.cs.wisc.edu/cidr/cidr2007/papers

messy looking Searcher.search

2007-01-12 Thread karl wettin
Is it safe to change from public final Hits search(Query query) throws IOException { return search(query, (Filter)null); } public Hits search(Query query, Filter filter) throws IOException { return new Hits(this, query, filter); } public Hits search(Query query, Sort sort) thr

Re: publishing lucene 2.0.0 contribs to ibiblio?

2007-01-12 Thread Erik Hatcher
Anyone gonna tackle this? I personally have never used Maven or the POM stuff, and have never published our releases to the repository. Could this step be automated somehow? Volunteers? Erik On Jan 8, 2007, at 3:37 PM, Joerg Hohwiller wrote: Hi there, I am using lucene 2.0.0 i

[jira] Resolved: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2007-01-12 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-675. Resolution: Fixed Have committed a baseline benchmarking suite thanks to Doron and Andrzej.

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2007-01-12 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464410 ] Grant Ingersoll commented on LUCENE-675: Doron, I have committed your additions. This truly is great stuff

[jira] Commented: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464409 ] Michael McCandless commented on LUCENE-771: --- You're right, backwards compatibility will allow a 2.1 client

Re: [jira] Commented: (LUCENE-140) docs out of order

2007-01-12 Thread Michael McCandless
Chris Hostetter wrote: : Also, we have other interesting Directory implementations : (MMapDirectory, DbDirectory, etc.); if we leave "create" at the : Directory layer then each of these places will have to replicate the : logic of "what it takes to cleanly create an index". you've convinced me .

[jira] Commented: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464395 ] Doron Cohen commented on LUCENE-771: Is that true? I thought that for previous format changes, the combination of

[jira] Commented: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464393 ] Michael McCandless commented on LUCENE-771: --- Yes, that is true. But there are also quite a few other chang

Re: [jira] Commented: (LUCENE-140) docs out of order

2007-01-12 Thread Chris Hostetter
: Also, we have other interesting Directory implementations : (MMapDirectory, DbDirectory, etc.); if we leave "create" at the : Directory layer then each of these places will have to replicate the : logic of "what it takes to cleanly create an index". you've convinced me ... the Directory implime

[jira] Commented: (LUCENE-550) InstantiatedIndex - faster but memory consuming index

2007-01-12 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464386 ] Karl Wettin commented on LUCENE-550: Doug Cutting [12/Jan/07 10:16 AM] > I don't see a patch file here. Your pro

[jira] Updated: (LUCENE-550) InstantiatedIndex - faster but memory consuming index

2007-01-12 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wettin updated LUCENE-550: --- Attachment: (was: lucene2karl-061122.tar.gz) > InstantiatedIndex - faster but memory consuming in

[jira] Updated: (LUCENE-550) InstantiatedIndex - faster but memory consuming index

2007-01-12 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wettin updated LUCENE-550: --- Attachment: trunk.diff.bz2 > InstantiatedIndex - faster but memory consuming index >

[jira] Commented: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464385 ] Doron Cohen commented on LUCENE-771: I have a question on this change - though I didn't look at the code yet - we

Re: [jira] Commented: (LUCENE-772) Lucene infinite loop? In FieldsReader.uncompress called from IndexSearcher.doc

2007-01-12 Thread robert engels
I think you are mistaken (in a sense). ZipInputStream is thread-safe inasmuch as an InputStream is thread-safe, which it isn't. I think what you are most likely seeing is that you are attempting to uncompress 'corrupted data' and the decompressor is getting confused, hence the hang.
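
For illustration only, here is a minimal sketch of how a decompression loop built on java.util.zip.Inflater can fail fast on bad input instead of hanging. It is not the FieldsReader code discussed in this thread; the class name SafeInflate and its method are hypothetical. It relies on the documented Inflater behavior that inflate() returns 0 when more input or a preset dictionary is required, so a loop that tests only finished() can spin forever on truncated data.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Lucene.
public class SafeInflate {
    /** Inflates zlib-compressed bytes, throwing instead of looping when no progress is possible. */
    public static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length);
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int count = inflater.inflate(buffer);
                // Zero bytes produced means the stream ended early or needs a dictionary;
                // without this check the loop would never terminate on truncated input.
                if (count == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or unsupported compressed data");
                }
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } finally {
            inflater.end(); // release native resources
        }
    }
}
```

A guard like this only turns a hang into an exception. It does not explain how the data became corrupted in the first place, for example through unsynchronized sharing of a stream between threads.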

[jira] Commented: (LUCENE-772) Lucene infinite loop? In FieldsReader.uncompress called from IndexSearcher.doc

2007-01-12 Thread Arthur Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464372 ] Arthur Smith commented on LUCENE-772: - Chuck - thanks - though IndexSearcher is supposed to be thread safe, so ma

[jira] Resolved: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-771. --- Resolution: Fixed Thanks for the review Yonik! And thanks for pointing this out Marv

[jira] Commented: (LUCENE-772) Lucene infinite loop? In FieldsReader.uncompress called from IndexSearcher.doc

2007-01-12 Thread Arthur Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464359 ] Arthur Smith commented on LUCENE-772: - By the way, this runaway behavior did happen after we caught 1 i/o excepti

[jira] Commented: (LUCENE-772) Lucene infinite loop? In FieldsReader.uncompress called from IndexSearcher.doc

2007-01-12 Thread Chuck Williams (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464358 ] Chuck Williams commented on LUCENE-772: --- I had many concurrency problems with java.util.zip and ended up switch

Re: Beyond Lucene 2.0 Index Design

2007-01-12 Thread Paul Elschot
Gentlemen, On Friday 12 January 2007 21:00, Chuck Williams wrote: > > Doug Cutting wrote on 01/12/2007 09:49 AM: > > Marvin Humphrey wrote: > >> Can you show us some code or pseudo-code for a BooleanScorer that > >> would use impact-sorted posting lists? > > > > Another way to interpret this prop

[jira] Created: (LUCENE-772) Lucene infinite loop? In FieldsReader.uncompress called from IndexSearcher.doc

2007-01-12 Thread Arthur Smith (JIRA)
Lucene infinite loop? In FieldsReader.uncompress called from IndexSearcher.doc -- Key: LUCENE-772 URL: https://issues.apache.org/jira/browse/LUCENE-772 Project: Lucene - Java

[jira] Commented: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464350 ] Yonik Seeley commented on LUCENE-771: - Sounds good, I agree with all the changes you outlined. > Change default

[jira] Commented: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-12 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464330 ] Artem Vasiliev commented on LUCENE-769: --- Btw I've integrated the modified fix into sharhound, 4000 documented s

Re: Beyond Lucene 2.0 Index Design

2007-01-12 Thread Chuck Williams
Doug Cutting wrote on 01/12/2007 09:49 AM: > Marvin Humphrey wrote: >> Can you show us some code or pseudo-code for a BooleanScorer that >> would use impact-sorted posting lists? > > Another way to interpret this proposal is index-only: the low-level > indexing APIs should be general enough to per

[jira] Created: (LUCENE-771) Change default write lock file location to index directory (not java.io.tmpdir)

2007-01-12 Thread Michael McCandless (JIRA)
Change default write lock file location to index directory (not java.io.tmpdir) --- Key: LUCENE-771 URL: https://issues.apache.org/jira/browse/LUCENE-771 Project: Lucene - Jav

Re: Beyond Lucene 2.0 Index Design

2007-01-12 Thread Doug Cutting
Marvin Humphrey wrote: Can you show us some code or pseudo-code for a BooleanScorer that would use impact-sorted posting lists? Another way to interpret this proposal is index-only: the low-level indexing APIs should be general enough to permit impact-sorted posting lists, and perhaps an impa

[jira] Commented: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-12 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464319 ] Artem Vasiliev commented on LUCENE-769: --- Refactored the fix according to Hoss's recommendations. Now only Store

[jira] Updated: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-12 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Vasiliev updated LUCENE-769: -- Attachment: selfContained.patch > [PATCH] Performance improvement for some cases of sorted sear

[jira] Updated: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-12 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Vasiliev updated LUCENE-769: -- Attachment: selfContained.patch > [PATCH] Performance improvement for some cases of sorted sear

[jira] Updated: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-12 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Vasiliev updated LUCENE-769: -- Attachment: selfContained.patch > [PATCH] Performance improvement for some cases of sorted sear

Re: Lockless commits -- great stuff!

2007-01-12 Thread Doug Cutting
Marvin Humphrey wrote: I'm writing a lot of KS 0.20 code with the notion that it will be submitted to Lucy [ ... ] Friendly reminder: if this is going to be eventually contributed to Apache, you need to make sure that all contributions can be under Apache's CLA. This would be simplest if yo

[jira] Commented: (LUCENE-550) InstantiatedIndex - faster but memory consuming index

2007-01-12 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464287 ] Doug Cutting commented on LUCENE-550: - I don't see a patch file here. Your proposal would be easier to evaluate

Re: IndexWriter forceOptimize() ?

2007-01-12 Thread Yonik Seeley
On 1/11/07, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: Yeah, I actually had: public int segments() { return segmentInfos.size(); } in my IndexReader, but then erased it precisely because I thought this was exposing too much about the impl. That was my first instinct, but then again, we do e

Re: [jira] Commented: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Doron Cohen
Thanks Mike! I got your reply too late and had closed it already. Since I opened this issue, that might not be too bad I guess; anyhow, I'm leaving it closed - I feel I made too much noise with this already. "Michael McCandless (JIRA)" <[EMAIL PROTECTED]> wrote on 12/01/2007 09:27:27: > > [ https://issues.apa

[jira] Commented: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464263 ] Doron Cohen commented on LUCENE-665: In case anyone else is looking for this - Jira "life cycle" under discussed

[jira] Closed: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen closed LUCENE-665. -- > temporary file access denied on Windows > --- > > Key

[jira] Resolved: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-665. Resolution: Won't Fix With lockless commits this is no longer reproducible, and although theoretic

[jira] Commented: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464261 ] Michael McCandless commented on LUCENE-665: --- OK sounds good. Weird that the reply-to was my email. Normal

[jira] Commented: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464250 ] Doron Cohen commented on LUCENE-665: Hi Michael, Funny that I got this email with reply-to to you rather than th

Re: Lockless commits -- great stuff!

2007-01-12 Thread Yonik Seeley
On 1/12/07, Michael McCandless <[EMAIL PROTECTED]> wrote: Now that readers are read-only, I think it makes sense to default the write lock into the index directory, and as you describe, no longer generate a "unique namespace" hash lock ID since the index dir gives us that scoping. +1 Are ther

RE: Beyond Lucene 2.0 Index Design

2007-01-12 Thread Dalton, Jeffery
Thanks Grant, I will take a look at this. > -Original Message- > From: Grant Ingersoll [mailto:[EMAIL PROTECTED] > Sent: Thursday, January 11, 2007 8:12 AM > To: java-dev@lucene.apache.org > Subject: Re: Beyond Lucene 2.0 Index Design > > Hi Jeff, > > Wondering if you (and/or others)

RE: Beyond Lucene 2.0 Index Design

2007-01-12 Thread Dalton, Jeffery
Lucene is a combination of the vector space similarity and Boolean models. A Lucene query is a ranked Boolean query. Documents must meet certain Boolean criteria, but this list is then ranked by similarity score. If you didn't care about returning the "top" hits, then I would agree that the docId

RE: Beyond Lucene 2.0 Index Design

2007-01-12 Thread Dalton, Jeffery
The reason is performance on large collections. The common case is that users don't care what field they are searching -- they just want the most relevant results, fast! If you need to restrict querying to a certain field only (subject, URL, etc...) you still need to index that. However, for many

Re: [jira] Commented: (LUCENE-140) docs out of order

2007-01-12 Thread Michael McCandless
Chris Hostetter wrote: : I think we should deprecate the "create" argument to : FSDirectory.getDirectory(*) and leave only the create argument in : IndexWriter's constructors. Am I missing something? Is there a : reason not to do this? i actually wonder about the problem from the opposite dir

[jira] Commented: (LUCENE-665) temporary file access denied on Windows

2007-01-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464174 ] Michael McCandless commented on LUCENE-665: --- Doron can we close this issue now? I think native locking and

Re: Lockless commits -- great stuff!

2007-01-12 Thread Michael McCandless
Marvin Humphrey wrote: On Jan 11, 2007, at 6:48 AM, Michael McCandless wrote: I too am happy that we have no more commit lock :) Not just that. :) No more lock directory, since we can put write.lock in the index directory itself. No more lock file name munging, since lock files from dif