Re: Lucene search performance: linear?

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 03:49, Zhang, Lisheng wrote: I found that search time is about linear: 2nd time is about 2 times longer than 1st query. What exactly did you measure, only the search() or also opening the IndexSearcher? The latter depends on index size, thus you shouldn't re-open

IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread trond . lindanger
Hi, In my test case, four Quartz jobs start every third minute, storing records in a database followed by an index update. After doing a test run over a period of 16 hours, I got this exception after 10 hours: java.io.IOException: Access is denied at

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread trond . lindanger
Forgot something... Also I got this exception, which may be related: java.io.IOException: Cannot delete C:\dknewscenter\2\_5d.cfs at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:319) at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:208)

RE: Problem: The selected method Keyword was not found

2006-12-05 Thread Aaron Shaw
Hi, thank you both for your help. Where would I find this Contributions? Aaron Risov, Maria wrote: It's in Contributions rather than being in the core Lucene folder. Marie Risov -----Original Message----- From: Erick Erickson [mailto:[EMAIL PROTECTED] Sent: Monday, December

Re: Lucene search performance: linear?

2006-12-05 Thread Michael McCandless
Zhang, Lisheng wrote: Hi, I first indexed 220,000 docs, all with a special keyword. I did a simple query and only fetched 5 docs, with Hits.length()=220,000. Then I indexed 440,000 docs, with the same keyword, queried it again and fetched a few docs, with Hits.length()=440,000. I found that search

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
[EMAIL PROTECTED] wrote: Forgot something... Also I got this exception, which may be related: java.io.IOException: Cannot delete C:\dknewscenter\2\_5d.cfs at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:319) at

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
[EMAIL PROTECTED] wrote: Hi, In my test case, four Quartz jobs start every third minute, storing records in a database followed by an index update. After doing a test run over a period of 16 hours, I got this exception after 10 hours: java.io.IOException: Access is denied at

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread trond . lindanger
Thank you for the quick and detailed answer. In this system multiple threads will, occasionally, try to write and/or read the same index, hence the pause waiting for the lock. This is not a good way to implement it and was done as a temp solution for debug purposes only. Multiple processes may

Store a document-like map

2006-12-05 Thread [EMAIL PROTECTED]
Hi, I'm building an application that's going to classify some documents. So I have a set of documents and a set of classes, and I must classify these docs into these classes. Now, documents are stored in the Lucene index through Document, while I don't know how I can store my classes in Lucene, and

Store document-like map

2006-12-05 Thread [EMAIL PROTECTED]
Hi, I'm building an application that's going to classify some documents. So I have a set of documents and a set of classes, and I must classify these docs into these classes. Now, documents are stored in the Lucene index through Document, while I don't know how I can store my classes in Lucene, and

Re: IOException: Access is denied from IndexWriter.Optimize

2006-12-05 Thread Michael McCandless
[EMAIL PROTECTED] wrote: Thank you for the quick and detailed answer. In this system multiple threads will, occasionally, try to write and/or read the same index, hence the pause waiting for the lock. This is not a good way to implement it and was done as a temp solution for debug purposes only.

Re: Full disk space during indexing process with 120 gb of free disk space

2006-12-05 Thread Ariel Isaac Romero Cartaya
Here is my source code where I convert PDF files to text for indexing. I got this source code from the Lucene in Action examples and adapted it for my convenience. I hope you can help me fix this problem; anyway, if you know another, more efficient way to do it, please tell me how: import

RE: Problem: The selected method Keyword was not found

2006-12-05 Thread Risov, Maria
Aaron, When you download Lucene from one of the mirrors http://www.apache.org/dyn/closer.cgi/lucene/java/ (you are using the Java version, right?), you should see packages named lucene-core-2.0.0.jar. These contain all Lucene modules and other components that became standard. You need the

RE: Lucene search performance: linear?

2006-12-05 Thread Zhang, Lisheng
Hi, Thanks for the reply. I only measured search(); I cached the IndexSearcher in memory. Best regards, Lisheng -----Original Message----- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 05, 2006 12:22 AM To: java-user@lucene.apache.org Subject: Re: Lucene search performance:

Re: Lucene search performance: linear?

2006-12-05 Thread Soeren Pekrul
Hello Lisheng, a search process usually has to do two things. First it has to find the term in the index. I don’t know the implementation of finding a term in Lucene. I hope that the index is at least a sorted list or a binary tree, so it can use binary search. The time finding a term depends on
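The lookup step described above can be sketched with a toy inverted index. This is a minimal illustration in plain Java and not Lucene's actual implementation: the class `ToyInvertedIndex`, its fields, and its methods are hypothetical names. The message is cut off before it names the second step, so treating it as "visit every matching document" is an assumption. Under that assumption, finding the term is a logarithmic binary search, while collecting hits grows with the number of documents that contain the term, which would explain roughly linear timings when every document carries the same keyword.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy inverted index (hypothetical, for illustration only; not Lucene code). */
final class ToyInvertedIndex {
    private final String[] sortedTerms; // term dictionary, must be sorted ascending
    private final int[][] postings;     // postings[i] = ids of docs containing sortedTerms[i]

    ToyInvertedIndex(String[] sortedTerms, int[][] postings) {
        this.sortedTerms = sortedTerms;
        this.postings = postings;
    }

    /** Step 1: binary search over the sorted dictionary, O(log T) for T terms. */
    int findTerm(String term) {
        return Arrays.binarySearch(sortedTerms, term);
    }

    /** Step 2 (assumed): visit every posting, O(number of matching docs). */
    List<Integer> search(String term) {
        List<Integer> hits = new ArrayList<>();
        int idx = findTerm(term);
        if (idx < 0) {
            return hits; // term not in the dictionary: no hits
        }
        for (int doc : postings[idx]) {
            hits.add(doc);
        }
        return hits;
    }
}
```

In this sketch, doubling the number of documents that contain the keyword doubles the work in `search`, whereas a term found in only a few documents stays cheap regardless of index size.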

Customized Analyzer

2006-12-05 Thread Alice
Hello! I wrote a custom analyzer that has synonyms of some words to help on search. I use the analyzer when searching the user's entered keyword. What I don't understand is that when tokens are returned from the synonyms set, the query parser returns the query

Re: Customized Analyzer

2006-12-05 Thread Chris Hostetter
: I search my synonyms set and if I find something I return the token like: : return new Token(synonyms[i], token.startOffset(), token.endOffset(), : token.type()); : And when it gets to the query I see: : : content:wind window When you add your synonym, it's just going into the stream of

RE: Customized Analyzer

2006-12-05 Thread Alice
Sorry, I forgot to include this information: Doing: token.setPositionIncrement(0); It returns content:(wind window) With: token.setPositionIncrement(1); Returns: content:wind window I really don't get it.. -----Original Message----- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent:
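The difference between the two increments can be illustrated with a small sketch of how token positions accumulate. It is written in plain Java with hypothetical names (`PositionSketch`, `Tok`, `positions`) and does not use Lucene's `Token` class; the running-sum rule starting from -1 is stated here as an assumption about how positions are derived from increments. With an increment of 0 the synonym lands on the same position as the original word, which is the situation in which the output above groups the two terms; with an increment of 1 it becomes the next word in the stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of position bookkeeping; not Lucene code. */
final class PositionSketch {
    /** Stand-in for a token: its text plus a position increment. */
    static final class Tok {
        final String text;
        final int posIncr;

        Tok(String text, int posIncr) {
            this.text = text;
            this.posIncr = posIncr;
        }
    }

    /** Absolute position of each token: a running sum of increments, starting from -1. */
    static List<Integer> positions(List<Tok> stream) {
        List<Integer> out = new ArrayList<>();
        int pos = -1;
        for (Tok t : stream) {
            pos += t.posIncr; // increment 0 keeps the previous token's position
            out.add(pos);
        }
        return out;
    }
}
```

For the stream `wind(1), window(0)` both tokens share position 0, while `wind(1), window(1)` places them at positions 0 and 1.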

Re: Customized Analyzer

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 21:37, Alice wrote: It does not work. Even with the synonyms indexed it is not found. So if your text contains wind it is not found by the query that prints as content:(wind window)? Then I suggest you post a small test case that shows this problem. As Chris

Re: Lucene on SQL 2005

2006-12-05 Thread Chris Lu
Sounds like a very simple and typical use case for a product catalog search. You are welcome to try DBSight, which is a J2EE web server that has UI and wizards for you to select data, configure search, and can run as a production-level search server. -- Chris Lu - Instant

Re: Customized Analyzer

2006-12-05 Thread Daniel Naber
On Tuesday 05 December 2006 20:14, Alice wrote: It returns content:(wind window) That might be the correct representation of a MultiPhraseQuery. So does your query work anyway? It's just that you cannot use QueryParser again to parse this output (similar to some other queries like

too many parentheses confuse Lucene

2006-12-05 Thread Daniel Naber
Hi, a query like (-merkel) AND schröder is parsed as +(-body:merkel) +body:schröder I get no hits for this query because +(-body:merkel) doesn't return any hits (it's not a valid query for Lucene). However, a query like -merkel AND schröder works fine. From the user's point-of-view, both

RE: Customized Analyzer

2006-12-05 Thread Alice
It does not work. Even with the synonyms indexed it is not found. That's why my guess was to remove the but I don’t know how. -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: terça-feira, 5 de dezembro de 2006 18:34 To: java-user@lucene.apache.org Subject: Re:

Re: Customized Analyzer

2006-12-05 Thread Mark Miller
Just took a quick peek at the MultiPhraseQuery toString() and it does indeed wrap the query in quotes (it also puts in the parentheses). You are generating a MultiPhraseQuery. Is that not your intent? The QueryParser will generate a MultiPhraseQuery when more than one token with different

RE: Customized Analyzer

2006-12-05 Thread Alice
Ok, This is the method that adds the aliases, it is located in my SynonymFilter: private void addAliasesToStack(Token token) { String[] synonyms = engine.getSynonyms(contents, token.termText()); if (synonyms == null) { return; }

RE: Lucene search performance: linear?

2006-12-05 Thread Zhang, Lisheng
Hi Soeren, Thanks very much for the explanations. Yes, there is no linear relation when searching a keyword which is only in a few docs. Best regards, Lisheng -----Original Message----- From: Soeren Pekrul [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 05, 2006 10:37 AM To:

RE: Customized Analyzer

2006-12-05 Thread Alice
No.. I am not indexing and searching with the same analyzer. The reason I do this is because I want to index exactly the contents I have in my database. This is used to find some products the company sells, and the users don’t write their names correctly, so if they type something that is

Re: too many parentheses confuse Lucene

2006-12-05 Thread Chris Hostetter
: works fine. From the user's point-of-view, both queries should return the : same result set. One solution I see is to add a MatchAllDocsQuery clause : to all prohibited clauses in QueryParser's getBooleanQuery() method. Is : that a valid solution? I tried with some simple cases and it seems to
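The behaviour under discussion can be modelled with plain sets of document ids. The sketch below is a hedged illustration, not QueryParser or BooleanQuery code: `BooleanSketch`, `eval`, and `and` are hypothetical names, and passing `null` for the positive clause is just this sketch's way of modelling a group that contains only a prohibited term. It shows why such a group matches nothing on its own, and how supplying "all documents" as the positive side (the role the proposal gives to a MatchAllDocsQuery clause) makes the two spellings of the query agree.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical set-based model of boolean clauses; not Lucene code. */
final class BooleanSketch {
    /**
     * Docs matching "positive AND NOT prohibited".
     * A null positive set models a purely negative group, which has
     * nothing to subtract from and therefore matches no documents.
     */
    static Set<Integer> eval(Set<Integer> positive, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>();
        if (positive == null) {
            return result;
        }
        result.addAll(positive);
        result.removeAll(prohibited);
        return result;
    }

    /** Intersection of two required clauses. */
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }
}
```

With documents {0, 1, 2, 3}, merkel in {1, 2} and schröder in {2, 3}, `and(eval(null, merkel), schroeder)` is empty, while `and(eval(all, merkel), schroeder)` yields {3}, the same answer as the unparenthesized form.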

RE: Customized Analyzer

2006-12-05 Thread Chris Hostetter
As stated before, a *self contained* test case would help people diagnose your problem ... just cutting and pasting a few snippets of your code is not enough for people to reproduce your problem. : And the return is: contents:(wind window) a MultiPhraseQuery that looks like that should be

New Lucene QueryParser

2006-12-05 Thread Mark Miller
I have finally delved back into the Lucene Query parser that I started a few months back. I am very close to wrapping up its initial development. I am currently looking for anybody willing to help me out with a little testing and maybe some design consultation (I am not happy with the