Re: Permissioning Documents

2004-12-10 Thread Paul Elschot
On Friday 10 December 2004 07:10, Steve Skillcorn wrote: Hi; I'm currently using Lucene (which I am extremely impressed with BTW) to index a knowledge base of documents. One issue I have is that only certain documents are available to certain users (or groups). The number of documents is

Lucene in Action e-book now available!

2004-12-10 Thread Erik Hatcher
The Lucene in Action e-book is now available at Manning's site: http://www.manning.com/hatcher2 Manning also put lots of other goodies there: the table of contents, about this book, preface, the foreword from Doug Cutting himself (thanks Doug!!!), and a couple of sample chapters. The

Re: Permissioning Documents

2004-12-10 Thread mark harwood
Hi Steve, Possibly the easiest way to handle this is to tag the documents with a field listing the permitted roles/groups (not the individual users). I would be tempted to keep the information that associates users to groups outside of the Lucene index eg in a relational DB. This way you do not
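The tagging approach above can be sketched as follows. This is only a minimal illustration, assuming a hypothetical field named "groups" that lists each document's permitted groups, and assuming group names are simple tokens with no query-syntax characters. It uses Lucene's query syntax (a leading + marks a required clause) to combine the user's query with a clause that matches any of the user's groups; the user-to-group lookup itself is assumed to happen elsewhere, e.g. in the relational DB.

```java
import java.util.Arrays;
import java.util.List;

final class GroupRestriction {
    // Hypothetical field name holding the roles/groups permitted to see a document.
    static final String GROUP_FIELD = "groups";

    // Builds "+(userQuery) +(groups:a groups:b)": both clauses are required,
    // and the second matches documents tagged with any one of the user's groups.
    static String restrict(String userQuery, List<String> userGroups) {
        if (userGroups.isEmpty()) {
            throw new IllegalArgumentException("user belongs to no groups");
        }
        StringBuilder groups = new StringBuilder();
        for (String group : userGroups) {
            if (groups.length() > 0) {
                groups.append(' ');
            }
            groups.append(GROUP_FIELD).append(':').append(group);
        }
        return "+(" + userQuery + ") +(" + groups + ")";
    }

    public static void main(String[] args) {
        // The group list would normally come from the external user/group store.
        System.out.println(restrict("lucene indexing", Arrays.asList("sales", "support")));
    }
}
```

The resulting string would then be handed to the query parser; building the equivalent query or filter programmatically is another option.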

HITCOLLECTOR+SCORE+DELIMA

2004-12-10 Thread Karthik N S
Hi guys, apologies. I am still in a dilemma about how to use the HitCollector to return hits with scores between 0.2f and 1.0f. There is no simple example of this, yet there is a lot of talk about its usage on the forum. Please, could somebody spare a bit of code (your intelligence)

RE: Lucene in Action e-book now available!

2004-12-10 Thread William W
Am I the first one who bought the Lucene in Action book ? Thanks Erik and Otis. William W. Silva From: Erik Hatcher [EMAIL PROTECTED] Reply-To: Lucene Users List [EMAIL PROTECTED] To: Lucene User [EMAIL PROTECTED],Lucene List [EMAIL PROTECTED] Subject: Lucene in Action e-book now available!

Re: HITCOLLECTOR+SCORE+DELIMA

2004-12-10 Thread Erik Hatcher
On Dec 10, 2004, at 7:39 AM, Karthik N S wrote: I am still in a dilemma about how to use the HitCollector to return hits with scores between 0.2f and 1.0f. There is no simple example of this, yet there is a lot of talk about its usage on the forum. Unfortunately there isn't a clean way to
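For readers following this thread, here is a rough sketch of the score-range idea from the question. It is not taken from the reply above. To stay runnable without Lucene on the classpath it defines a local stand-in, ScoreCallback (an assumption), that mirrors the collect(int doc, float score) callback shape of HitCollector; in real code the collector would extend HitCollector and be passed to the searcher. One caveat worth checking against the javadoc: scores handed to a HitCollector are raw scores, not the normalized values seen through Hits, so a fixed 0.2f to 1.0f window may not mean the same thing.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in (an assumption) for the collect(int doc, float score) callback,
// so that this sketch compiles and runs without Lucene.
abstract class ScoreCallback {
    abstract void collect(int doc, float score);
}

// Remembers the ids of documents whose score falls inside [min, max].
final class ScoreRangeCollector extends ScoreCallback {
    private final float min;
    private final float max;
    private final List<Integer> docs = new ArrayList<Integer>();

    ScoreRangeCollector(float min, float max) {
        this.min = min;
        this.max = max;
    }

    void collect(int doc, float score) {
        if (score >= min && score <= max) {
            docs.add(doc);
        }
    }

    List<Integer> docs() {
        return docs;
    }

    public static void main(String[] args) {
        ScoreRangeCollector collector = new ScoreRangeCollector(0.2f, 1.0f);
        // Hand-written (doc, score) pairs standing in for what a search would report.
        collector.collect(1, 0.1f);
        collector.collect(2, 0.2f);
        collector.collect(3, 0.75f);
        collector.collect(4, 1.5f);
        System.out.println(collector.docs());
    }
}
```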

Re: SEARCH +HITS+LIMIT

2004-12-10 Thread Erik Hatcher
On Dec 10, 2004, at 8:24 AM, Andraz Skoric wrote: Displaytag (http://displaytag.sourceforge.net/) is for displaying search results in multiple pages I don't know displaytag internals, but be cautious with such things. What you do not want to happen is all the results to be grabbed and cached

Re: Lucene in Action e-book now available!

2004-12-10 Thread Luke Shannon
Nice Work! Congratulations Guys. - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: Lucene User [EMAIL PROTECTED]; Lucene List [EMAIL PROTECTED] Sent: Friday, December 10, 2004 3:52 AM Subject: Lucene in Action e-book now available! The Lucene in Action e-book is now

Re: Lucene in Action e-book now available!

2004-12-10 Thread Robinson Raju
Congrats! I went through sample chapter 1. Well written. On Fri, 10 Dec 2004 09:58:25 -0500, Luke Shannon [EMAIL PROTECTED] wrote: Nice Work! Congratulations Guys. - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: Lucene User [EMAIL PROTECTED]; Lucene List

Re: OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Justin Swanhart
You probably need to increase the amount of RAM available to your JVM. See the parameters: -Xmx :Maximum memory usable by the JVM -Xms :Initial memory allocated to JVM My params are; -Xmx2048m -Xms128m (2G max, 128M initial) On Fri, 10 Dec 2004 11:17:29 -0600, Sildy Augustine [EMAIL
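A quick way to confirm that the -Xmx and -Xms settings actually reached the JVM is to print what the runtime reports. The sketch below uses only the standard Runtime API; the class name HeapReport is made up for illustration.

```java
final class HeapReport {
    // Converts a byte count to whole megabytes.
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the -Xmx ceiling; totalMemory() is what is currently reserved.
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it as, for example, java -Xms128m -Xmx2048m HeapReport should show a maximum close to the requested ceiling; the exact figure reported varies by JVM.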

Re: OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Xiangyu Jin
I am not sure, but I guess there are three possibilities. (1) I see that you use Field.Text(contents, stringBuffer.toString()). This will store the whole text string in the document object, and it might be long ... I do not know the details of how Lucene implements this. I think you can try to use unstored

sorting tokenized field

2004-12-10 Thread Praveen Peddi
I read that tokenized fields cannot be sorted. In order to sort a tokenized field, either the application has to duplicate the field under a different name and not tokenize it, or come up with something else. But shouldn't the search engine take care of this? Are there any plans of putting this

Re: sorting tokenized field

2004-12-10 Thread Erik Hatcher
On Dec 10, 2004, at 1:40 PM, Praveen Peddi wrote: I read that tokenized fields cannot be sorted. In order to sort a tokenized field, either the application has to duplicate the field under a different name and not tokenize it, or come up with something else. But shouldn't the search engine take care of

Re: sorting tokenized field

2004-12-10 Thread Praveen Peddi
I was only thinking in terms of other search engines. I have worked with other search engines and I didn't see this requirement before. I think you are right that it's wasteful to duplicate all tokenized fields. Not sure if there is a smart way of dealing with it. Praveen - Original Message -

Re: OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Jin, Ying
Great!!! It works perfect after I setup -Xms and -Xmx JVM command-line parameters with: java -Xms128m -Xmx128m It turns out that my JVM is running out of memory. And Otis is right on my reader closing too. reader.close() will close the reader and release any system resources associated with

No of docs using IndexSearcher

2004-12-10 Thread Ravi
How do I get the number of docs in an index if I just have access to a searcher on that index? Thanks in advance Ravi.

Re: No of docs using IndexSearcher

2004-12-10 Thread [EMAIL PROTECTED]
numDocs() http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#numDocs() Ravi said the following on 12/10/2004 2:42 PM: How do I get the number of docs in an index If I just have access to a searcher on that index? Thanks in advance Ravi.

Re: No of docs using IndexSearcher

2004-12-10 Thread [EMAIL PROTECTED]
If your index is open shouldnt there be an instance of IndexReader already there? Ravi said the following on 12/10/2004 3:13 PM: I already have a field with a constant value in my index. How about using IndexSearcher.docFreq(new Term(field,value))? Then I don't have to instantiate IndexReader.

RE: No of docs using IndexSearcher

2004-12-10 Thread Ravi
I'm fairly new to Lucene. The main reason why I didn't use the IndexReader constructor for the searcher is that we organize the indexes as different partitions depending on the document's date, and during searching I instantiate a MultiSearcher object on these different partitions depending on from-date

Re: sorting tokenized field

2004-12-10 Thread Praveen Peddi
Since I am not very familiar with the Lucene code, I couldn't make much out of your patch. But is this patch already tested and proven to be efficient? If so, why can't it be merged into the Lucene code and made part of the release? I think the bug is valid. It's very likely that people want to

Sorting based on calculations at search time

2004-12-10 Thread Gurukeerthi Gurunathan
Hello, I'd like some suggestions on the following scenario. Say I have an index with a stored, indexed field called 'weight'(essentially an int stored as string). I'd like to sort in descending order of final weight, the search results by performing a calculation involving the lucene score
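One way to picture the scenario is to compute the final weight outside Lucene and sort on it afterwards. The sketch below is only that: the actual calculation is not given in the snippet above, so the product score * weight is purely a placeholder assumption, and the class names WeightedHit and WeightedSort are hypothetical. The stored weight is parsed from its string form, as described in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// One search result: document id, Lucene score, and the stored 'weight' field.
final class WeightedHit {
    final int doc;
    final float score;
    final int weight;

    WeightedHit(int doc, float score, String storedWeight) {
        this.doc = doc;
        this.score = score;
        this.weight = Integer.parseInt(storedWeight); // the field is an int stored as a string
    }

    // Placeholder formula (an assumption): the real calculation is not shown in the thread.
    float finalWeight() {
        return score * weight;
    }
}

final class WeightedSort {
    // Sorts hits in place, highest final weight first.
    static void sortDescending(List<WeightedHit> hits) {
        Collections.sort(hits, new Comparator<WeightedHit>() {
            public int compare(WeightedHit a, WeightedHit b) {
                return Float.compare(b.finalWeight(), a.finalWeight());
            }
        });
    }

    public static void main(String[] args) {
        List<WeightedHit> hits = new ArrayList<WeightedHit>();
        hits.add(new WeightedHit(1, 0.9f, "2"));
        hits.add(new WeightedHit(2, 0.5f, "10"));
        hits.add(new WeightedHit(3, 0.3f, "1"));
        sortDescending(hits);
        for (WeightedHit hit : hits) {
            System.out.println(hit.doc + " -> " + hit.finalWeight());
        }
    }
}
```

Sorting after the fact means every matching document has to be collected first, which may be costly for large result sets.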

Re: Lucene in Action e-book now available!

2004-12-10 Thread Jonathan Hager
Congratulations on the book. I ordered my copy the other day via regular post and am eagerly awaiting it. It looks like it will make lucene available to a much wider audience. Based on the table of contents, I wanted to toss out a couple of ideas for your next book or articles. 1. I didn't see

RE: Sorting based on calculations at search time

2004-12-10 Thread Gurukeerthi Gurunathan
Thanks Otis for your response and compliments (wish I was a lucene guru like you guys :-) I believe you are talking about the boost factor for fields or documents while searching. That does not apply in my case - maybe I am missing a point here. The weight field I was talking about is only for

A simple Query Language

2004-12-10 Thread Dongling Ding
Hi, I am going to implement a search service and plan to use Lucene. Is there any simple query language that is independent of any particular search engine out there? Thanks Dongling

RE: A simple Query Language

2004-12-10 Thread Chuck Williams
You could support only terms with no operators at all, which will work in most search engines (except those that require combining operators). Using just terms and phrases enclosed in double quotes is pretty universal. After that, you might want to add +/- required/prohibited restrictions, which many engines

Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-10 Thread Chris Lamprecht
Very cool, thanks for posting this! Google's feature doesn't seem to do a search on every keystroke necessarily. Instead, it waits until you haven't typed a character for a short period (I'm guessing about 100 or 150 milliseconds). So if you type fast, it doesn't hit the server until you

RE: Sorting based on calculations at search time

2004-12-10 Thread Chris Hostetter
: I believe you are talking about the boost factor for fields or documents : while searching. That does not apply in my case - maybe I am missing a : point here. : The weight field I was talking about is only for the calculation Otis is suggesting that you set the boost of the document to be your

Re: SEARCH +HITS+LIMIT

2004-12-10 Thread Andraz Skoric
Displaytag (http://displaytag.sourceforge.net/) is for displaying search results in multiple pages lp, a Karthik N S wrote: Hi guys, apologies... One question for the forum [especially Erik] 1) I have a MERGED index with 100,000 files indexed into it ( Content is one of the

RE: OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Sildy Augustine
I think you should close your files in a finally clause in case of exceptions with file system and also print out the exception. You could be running out of file handles. -Original Message- From: Jin, Ying [mailto:[EMAIL PROTECTED] Sent: Friday, December 10, 2004 11:15 AM To: [EMAIL
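The finally-clause advice above looks roughly like this. It is a minimal sketch using only java.io, with a made-up class name SafeRead; the same pattern applies to whatever streams or readers the indexing code opens.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class SafeRead {
    // Reads everything from the source and always closes it, even if reading fails.
    static String readAll(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        try {
            StringBuilder text = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
            return text.toString();
        } finally {
            // Runs on normal return and on exception; closing the wrapper closes the source too.
            reader.close();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.print(readAll(new StringReader("first line\nsecond line")));
        } catch (IOException e) {
            // Print the exception rather than swallowing it, as suggested above.
            e.printStackTrace();
        }
    }
}
```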

OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Jin, Ying
Hi, Everyone, We're trying to index ~1500 archives but get OutOfMemoryError about halfway through the index process. I've tried to run program under two different Redhat Linux servers: One with 256M memory and 365M swap space. The other one with 512M memory and 1G swap space. However, both got

RE: OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Otis Gospodnetic
Ying, You should follow this finally block advice below. In addition, I think you can just close the reader, and it will close the underlying stream (I'm not sure about that, double-check it). You are not running out of file handles, though. Your JVM is running out of memory. You can play

Re: OutOfMemoryError with Lucene 1.4 final

2004-12-10 Thread Xiangyu Jin
OK, I see. It seems most people think it is the third possibility. On Fri, 10 Dec 2004, Xiangyu Jin wrote: I am not sure, but I guess there are three possibilities. (1) I see that you use Field.Text(contents, stringBuffer.toString()). This will store the whole text string in the document object. And it

RE: sorting tokenized field

2004-12-10 Thread Aviran
I have suggested a solution for this problem ( http://issues.apache.org/bugzilla/show_bug.cgi?id=30382 ) you can use the patch suggested there and recompile lucene. Aviran http://www.aviransplace.com -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Friday, December

MultiSearcher close

2004-12-10 Thread Ravi
If I close a MultiSearcher, does it close all the associated searchers too? I was getting a bad file descriptor error, if I close the MultiSearcher object and open it again for another search without reinstantiating the underlying searchers. Thanks in advance, Ravi

Re: MultiSearcher close

2004-12-10 Thread Erik Hatcher
On Dec 10, 2004, at 4:16 PM, Ravi wrote: If I close a MultiSearcher, does it close all the associated searchers too? It sure does: public void close() throws IOException { for (int i = 0; i < searchables.length; i++) searchables[i].close(); } I was getting a bad file descriptor

Incremental Search experiment with Lucene, sort of like the new Google Suggestion page

2004-12-10 Thread David Spencer
Google just came out with a page that gives you feedback as to how many pages will match your query and variations on it: http://www.google.com/webhp?complete=1hl=en I had an unexposed experiment I had done with Lucene a few months ago that this has inspired me to expose - it's not the same,