RE: Advantage of putting lucene index in RDBMS

2006-10-11 Thread sachin
I feel that implementing Lucene inside the RDBMS is nothing but an implementation of the following interfaces: TermDocs, TermVector, TermPositions. -----Original Message----- From: Karel Tejnora [mailto:[EMAIL PROTECTED] Sent: Friday, October 06, 2006 4:11 PM To: java-user@lucene.apache.org Subject: Re:

How to fire a query ?

2006-10-11 Thread Bhavin Pandya
Hi guys, how can I match "digital camera" when someone fires a query for "digital cam"? Do I need to make a manual list of such items and look it up at search time, or is there a better way to do this? -Bhavin Pandya

Re: Incremental updates / slow searches.

2006-10-11 Thread Rickard Bäckman
Thanks for the suggestions. We tried to reduce the number of times we open a new searcher, with some progress. However, a lot of our searches still time out. We are currently opening a new searcher and warming it up before doing the switch. We even map the fields we are using for deleting to the

Re: How to fire a query ?

2006-10-11 Thread Erick Erickson
You might have luck with one of the stemming analyzers, both at index time and at search time. Do note that stemmers have their own quirks. It's not clear that they would transform camera into cam, for instance. Other than that, I don't know how to get what you want. Perhaps you could provide

Distinct search

2006-10-11 Thread Eugeny N Dzhurinsky
Hi there! I have an index structure like this: document_id, some_text. When searching for some set of documents, there could be a case when several comments for the same document match the search criteria. In such a case I need to get a single hit for all such cases, in other words - perform a

Re: Distinct search

2006-10-11 Thread Erick Erickson
There's no real group_by functionality in Lucene. I'd have to ask, though, why organize your index this way? I'm guessing that you're approaching this from a database perspective, and if that's so, you may want to re-think some things. Although see below for my contradicting myself. Lucene
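Since the reply says Lucene has no group_by, one way to get a single hit per document is to collapse the hits in application code after the search returns. The following is only a sketch of that idea, not something proposed in the thread: the Hit class, its documentId and score fields, and collapseByDocumentId are hypothetical names, and the hits are assumed to have already been read out of the search results into a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: collapses several matching comments into one hit per document.
final class DistinctHits {

    /** One search hit (hypothetical type): the stored document_id value and the hit's score. */
    static final class Hit {
        final String documentId;
        final float score;

        Hit(String documentId, float score) {
            this.documentId = documentId;
            this.score = score;
        }
    }

    /**
     * Keeps the best-scoring hit for each document_id.
     * Results are ordered by the first appearance of each document_id in the input.
     */
    static List<Hit> collapseByDocumentId(List<Hit> hits) {
        Map<String, Hit> best = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit previous = best.get(hit.documentId);
            if (previous == null || hit.score > previous.score) {
                // Replacing the value of an existing key keeps its original position.
                best.put(hit.documentId, hit);
            }
        }
        return new ArrayList<>(best.values());
    }
}
```

This costs one pass over the hits and memory proportional to the number of distinct documents, so it fits result sets that are small enough to iterate in full.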

Re: Distinct search

2006-10-11 Thread Eugeny N Dzhurinsky
On Wed, Oct 11, 2006 at 11:30:03AM -0400, Erick Erickson wrote: There's no real group_by functionality in Lucene. I'd have to ask, though, why organize your index this way? I'm guessing that you're approaching this from a database perspective, and if that's so, you may want to re-think some

Re: Distinct search

2006-10-11 Thread Erick Erickson
No problem. Partly, it's helping me clarify my current problem <G>. Yes, you must delete and re-add a document to change it. You might want to look at the IndexModifier class. Be aware of some things: 1 Lucene doc IDs may change when the index is changed, I think after optimization. So, in

Re: Distinct search

2006-10-11 Thread Eugeny N Dzhurinsky
On Wed, Oct 11, 2006 at 12:09:40PM -0400, Erick Erickson wrote: No problem. Partly, it's helping me clarify my current problem <G>. Yes, you must delete and re-add a document to change it. You might want to look at the IndexModifier class. Be aware of some things: 1 Lucene doc IDs may

Re: wildcard and span queries

2006-10-11 Thread Erick Erickson
Problem 3482: I'm probably close to being able to start work. Except... How to count hits with SrndQuery? Or, more generally, with arbitrary wildcards and boolean operators? So, say I've indexed a book by page. That is, each page is a document. I know a particular page matches my query because

Re: wildcard and span queries

2006-10-11 Thread Erik Hatcher
Erick - what about using getSpans() from the SpanQuery that is generated? That should give you what you're after I think. Erik On Oct 11, 2006, at 2:17 PM, Erick Erickson wrote: Problem 3482: I'm probably close to being able to start work. Except... How to count hits with

Re: wildcard and span queries

2006-10-11 Thread Paul Elschot
On Wednesday 11 October 2006 20:30, Erik Hatcher wrote: Erick - what about using getSpans() from the SpanQuery that is generated? That should give you what you're after I think. Erik You can also use skipTo(docNr) on the spans to skip to the docNr of the book that you're after. A

Big problem with big indexes

2006-10-11 Thread Ariel Isaac Romero Cartaya
Hi everybody: I have a big problem making parallel searches in big indexes. I have indexed over 60,000 articles with Lucene, and I have distributed the indexes across 10 computer nodes so that each index does not exceed 60 MB in size. I make parallel searches in those indexes but I get the search

Re: How to fire a query ?

2006-10-11 Thread Daniel Noll
Bhavin Pandya wrote: Hi guys, how can I match "digital camera" when someone fires a query for "digital cam"? Do I need to make a manual list of such items and look it up at search time, or is there a better way to do this? You will need a list, but you may not need to make it manually (look at WordNet, I'm
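The list-based lookup at search time could be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: AbbreviationExpander and its methods are hypothetical names, the list is a plain in-memory map filled by hand (it could equally be loaded from a resource such as WordNet), and the expanded terms are assumed to be combined with OR when the actual query is built.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: expands known short forms in a user query before searching.
final class AbbreviationExpander {

    // Maps a short form (e.g. "cam") to its full form (e.g. "camera").
    private final Map<String, String> expansions = new HashMap<>();

    void add(String shortForm, String fullForm) {
        expansions.put(shortForm.toLowerCase(Locale.ROOT), fullForm.toLowerCase(Locale.ROOT));
    }

    /** Returns the lowercased query terms, with each known short form followed by its full form. */
    List<String> expand(String query) {
        List<String> terms = new ArrayList<>();
        for (String token : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty query yields one empty token; skip it
            }
            terms.add(token);
            String full = expansions.get(token);
            if (full != null) {
                terms.add(full);
            }
        }
        return terms;
    }
}
```

With add("cam", "camera") registered, expand("digital cam") yields the terms digital, cam, camera, so a document containing "digital camera" can match the shorter query.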

Re: Big problem with big indexes

2006-10-11 Thread Erick Erickson
Something's extremely not right <G>. First of all, I'm running a 1.4G index on a single machine and getting very good results, under 10 seconds even for the most complex queries I'm firing. This is with 870,000 documents, and includes sorting by criteria other than relevance. And using span

Re: Big problem with big indexes

2006-10-11 Thread Doron Cohen
These times really are not reasonable. But 60K does not seem like much for Lucene. I once indexed ~1M docs of ~20K each, that's a ~20GB input collection. The resulting index size was ~2.5GB, and the search time for a short free-text (or) query of 2-3 words was ~300ms for a hot query and ~900ms for a cold