Re: Negative Boost
Daniel Naber writes:
> On Wednesday 04 August 2004 13:19, Terry Steichen wrote:
> > I can't get negative boosts to work with QueryParser. Is it possible to do
> > so?
>
> Isn't that the same as using a boost < 1, e.g. 0.1? That should be possible.

No. Consider:

a^-1 OR b

A boost of -1 means that the score gets smaller if a document contains a. So it's somewhat similar to NOT a, though less strict. A boost of 0.1 means that the score is increased less for an occurrence of a. Usually one just wants the latter, but it's not the same.

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
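The difference between the two kinds of boost can be illustrated with a minimal Java sketch. It assumes a deliberately simplified model in which a document's score is the sum of boost times raw score over the clauses it matches; this is not Lucene's actual scoring formula, and the class and method names are hypothetical.

```java
// Toy model only: score = sum over matching clauses of (boost * rawScore).
// This is NOT Lucene's scoring formula; all names here are hypothetical.
final class BoostSketch {

    /** Scores a document for the query "a^boostA OR b^boostB" under the toy model. */
    static double score(boolean hasA, boolean hasB, double boostA, double boostB) {
        double rawScore = 1.0; // assumption: every matching clause contributes the same raw score
        double total = 0.0;
        if (hasA) {
            total += boostA * rawScore;
        }
        if (hasB) {
            total += boostB * rawScore;
        }
        return total;
    }

    public static void main(String[] args) {
        // a^-1 OR b: a document that also contains "a" scores lower than one with only "b".
        System.out.println("negative boost, b only: " + score(false, true, -1.0, 1.0));
        System.out.println("negative boost, a and b: " + score(true, true, -1.0, 1.0));

        // a^0.1 OR b: a document that also contains "a" still scores higher, just by less.
        System.out.println("small boost, b only: " + score(false, true, 0.1, 1.0));
        System.out.println("small boost, a and b: " + score(true, true, 0.1, 1.0));
    }
}
```

Under this model a negative boost pushes documents containing a down the ranking, while a small positive boost only reduces how much they are pushed up.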
Re: Split an existing index into smaller segments without a re-index?
Kevin A. Burton wrote:
> Is it possible to take an existing index (say 1G) and break it up into a number of smaller indexes (say ten 100M indexes)... I don't think there's currently an API for this but it's certainly possible (I think).

Yes, it is theoretically possible but not yet implemented.

An easy way to implement it would be to subclass FilterIndexReader to return a subset of documents, then use IndexWriter.addIndexes() to write out each subset as a new index. Subsets could be ranges of document numbers, and one could use TermPositions.skipTo() to accelerate the TermPositions subset implementation, but this still wouldn't be quite as fast as an index splitter that only reads each TermPositions once. If we added a lower-level index writing API then one could use that to implement this...

Doug
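The "ranges of document numbers" idea can be sketched in plain Java. This only computes the ranges that each subset reader would cover; it does not touch Lucene, it ignores deleted documents, and the class and method names are hypothetical.

```java
// Sketch: divide the document numbers [0, maxDoc) into contiguous ranges,
// one per target index. Names are hypothetical; no Lucene API is used here.
final class SplitRanges {

    /** Returns one {startInclusive, endExclusive} pair per part, covering [0, maxDoc). */
    static int[][] ranges(int maxDoc, int parts) {
        if (maxDoc < 0 || parts <= 0) {
            throw new IllegalArgumentException("maxDoc must be >= 0 and parts must be > 0");
        }
        int[][] out = new int[parts][2];
        int base = maxDoc / parts;  // minimum number of documents per part
        int extra = maxDoc % parts; // the first 'extra' parts take one more document
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int size = base + (i < extra ? 1 : 0);
            out[i][0] = start;
            out[i][1] = start + size;
            start += size;
        }
        return out;
    }

    public static void main(String[] args) {
        for (int[] r : ranges(1005, 10)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Each range would then be handed to a reader subclass that exposes only the documents inside it, and each such reader written out as its own index.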
Split an existing index into smaller segments without a re-index?
Is it possible to take an existing index (say 1G) and break it up into a number of smaller indexes (say ten 100M indexes)...

I don't think there's currently an API for this but it's certainly possible (I think).

Kevin

--
Please reply using PGP. http://peerfear.org/pubkey.asc
NewsMonster - http://www.newsmonster.org/
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
RE: Question on the minimum value for DateField
The date is stored as a long holding the number of milliseconds since January 1, 1970. Anything before that would be negative.

-----Original Message-----
From: Terence Lai [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 04, 2004 6:25 PM
To: Lucene Users List
Subject: Question on the minimum value for DateField

Hi All,

I realize that DateField cannot accept a value before the year 1970, specifically in the org.apache.lucene.document.DateField.timeToString() method. Is there any technical reason for this limitation?

Thanks,
Terence
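A short Java sketch shows why pre-1970 dates run into this. Java timestamps count milliseconds from the start of 1970, so earlier instants come out negative. The isEncodable() guard below is a hypothetical stand-in for the kind of check the question describes, not DateField's actual code.

```java
import java.time.Instant;

// Sketch: timestamps before 1970 are negative. The isEncodable() guard is a
// hypothetical stand-in for the check described in the thread.
final class EpochSketch {

    /** Hypothetical guard: only non-negative timestamps are accepted. */
    static boolean isEncodable(long epochMillis) {
        return epochMillis >= 0;
    }

    public static void main(String[] args) {
        long before = Instant.parse("1969-12-31T00:00:00Z").toEpochMilli();
        long after = Instant.parse("2004-08-04T00:00:00Z").toEpochMilli();
        System.out.println(before + " encodable=" + isEncodable(before));
        System.out.println(after + " encodable=" + isEncodable(after));
    }
}
```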
Re: Question on number of fields in a document.
Thanks.

I was looking at some older email on the list and found a message where Doug Cutting says that for fields that are not analyzed, we need not store the norms, nor load them into memory.

That change in the indexer will help a lot in this situation, where we might have 24 fields indexed but not analyzed.

ZJ

Paul Elschot <[EMAIL PROTECTED]> wrote:

On Wednesday 04 August 2004 18:22, John Z wrote:
> Hi
>
> I had a question related to the number of fields in a document. Is there any
> limit to the number of fields you can have in an index?
>
> We have around 25-30 fields per document at present: about 6 are keywords,
> around 6 are stored but not indexed, and the rest are text fields, which are
> analyzed and indexed. We are planning on adding around 24 more
> fields, mostly keywords.
>
> Does anyone see any issues with this? Any impact on search or indexing?

During search one byte of RAM is needed per searched field per document for the normalisation factors, even if a document field is empty. This RAM is occupied the first time a field is searched after opening an index reader.

Supposing your queries would actually search 50 fields before closing the index reader, the norms would occupy 50 bytes/doc, or 1 GB per 20 million documents.

Regards,
Paul
Question on the minimum value for DateField
Hi All,

I realize that DateField cannot accept a value before the year 1970, specifically in the org.apache.lucene.document.DateField.timeToString() method. Is there any technical reason for this limitation?

Thanks,
Terence
Re: Negative Boost
A solution to this has been proposed before - see http://wiki.apache.org/jakarta-lucene/CommunityContributions

Cheers
Mark
Re: Negative Boost
Terry Steichen wrote:
> But if, in the future, I or someone else took on this task of enhancing QueryParser, I'd like to be assured that the underlying Lucene engine will accept and support negative boosting. Is that the case?

Lucene will multiply negative boosts into scores just like positive ones. I've never been convinced that it makes much sense to use negative boosts in a scoring formula such as Lucene's, but there's nothing stopping you from using them.

Doug
Re: Question on number of fields in a document.
On Wednesday 04 August 2004 18:22, John Z wrote:
> Hi
>
> I had a question related to the number of fields in a document. Is there any
> limit to the number of fields you can have in an index?
>
> We have around 25-30 fields per document at present: about 6 are keywords,
> around 6 are stored but not indexed, and the rest are text fields, which are
> analyzed and indexed. We are planning on adding around 24 more
> fields, mostly keywords.
>
> Does anyone see any issues with this? Any impact on search or indexing?

During search one byte of RAM is needed per searched field per document for the normalisation factors, even if a document field is empty. This RAM is occupied the first time a field is searched after opening an index reader.

Supposing your queries would actually search 50 fields before closing the index reader, the norms would occupy 50 bytes/doc, or 1 GB per 20 million documents.

Regards,
Paul
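The norms estimate above is simple arithmetic, and a small Java sketch makes it easy to plug in other numbers. It assumes exactly one byte per searched field per document, as stated in the message; the class and method names are hypothetical.

```java
// Sketch: RAM taken by norms, assuming one byte per searched field per document.
// Names are hypothetical; the one-byte figure comes from the message above.
final class NormsMemory {

    /** Returns the number of bytes occupied by norms once all given fields have been searched. */
    static long normsBytes(int searchedFields, long documents) {
        return (long) searchedFields * documents;
    }

    public static void main(String[] args) {
        long bytes = normsBytes(50, 20_000_000L);
        System.out.println(bytes + " bytes"); // 50 fields x 20 million docs = 1,000,000,000 bytes
    }
}
```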
RE: Question on number of fields in a document.
You should be fine, no problem with the number of fields.

-----Original Message-----
From: John Z [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 04, 2004 12:23 PM
To: [EMAIL PROTECTED]
Subject: Question on number of fields in a document.

Hi

I had a question related to the number of fields in a document. Is there any limit to the number of fields you can have in an index?

We have around 25-30 fields per document at present: about 6 are keywords, around 6 are stored but not indexed, and the rest are text fields, which are analyzed and indexed. We are planning on adding around 24 more fields, mostly keywords.

Does anyone see any issues with this? Any impact on search or indexing?

Thanks
ZJ
Re: Negative Boost
Well, I'm not too confident of my JavaCC skills, and when I've messed around with this stuff in the past, I sometimes ended up inadvertently creating problems in other areas of the query syntax.

But if, in the future, I or someone else took on this task of enhancing QueryParser, I'd like to be assured that the underlying Lucene engine will accept and support negative boosting. Is that the case?

Regards,
Terry

----- Original Message -----
From: Erik Hatcher
To: Lucene Users List
Sent: Wednesday, August 04, 2004 9:12 AM
Subject: Re: Negative Boost

On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser. Is it possible
> to do so?

Closer inspection on the parsing:

    <Boost> TOKEN : {
      <NUMBER: (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT
    }

where

    <#_NUM_CHAR: ["0"-"9"] >

So, no, negative boosts don't appear possible with QueryParser currently. I have no objections if you'd like to enhance the grammar to allow for it (provided sufficient unit tests, of course).

Erik
Re: Negative Boost
Near as I can tell, setting the boost to, say, 0.10, doesn't seem to do anything.

Regards,
Terry

----- Original Message -----
From: Otis Gospodnetic
To: Lucene Users List
Sent: Wednesday, August 04, 2004 9:38 AM
Subject: Re: Negative Boost

You can just use a boost that is < 1.0, no?

Otis

--- Terry Steichen <[EMAIL PROTECTED]> wrote:
> I can't get negative boosts to work with QueryParser. Is it possible
> to do so?
>
> TIA,
>
> Terry
Re: Negative Boost
On Wednesday 04 August 2004 13:19, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser. Is it possible to do
> so?

Isn't that the same as using a boost < 1, e.g. 0.1? That should be possible.

Regards
Daniel
Question on number of fields in a document.
Hi

I had a question related to the number of fields in a document. Is there any limit to the number of fields you can have in an index?

We have around 25-30 fields per document at present: about 6 are keywords, around 6 are stored but not indexed, and the rest are text fields, which are analyzed and indexed. We are planning on adding around 24 more fields, mostly keywords.

Does anyone see any issues with this? Any impact on search or indexing?

Thanks
ZJ
Re: Hit & Score [ Between ]
You could instead use a HitCollector to gather only documents with scores in that range.

Doug

Karthik N S wrote:
> Hi
>
> Apologies
>
> If I want to get all the hits for scores between 0.5f and 0.8f, I usually use
>
>     query = QueryParser.parse(srchkey, Fields, analyzer);
>     Hits hits = searcher.search(query);
>     int tothits = hits.length();
>     // walk every hit and keep only those whose score is in range
>     for (int i = 0; i < tothits; i++) {
>         docs = hits.doc(i);
>         Score = hits.score(i);
>         if ((Score > 0.5f) && (Score < 0.8f)) {
>             System.out.println(" FileName : " + docs.get("filename"));
>         }
>     }
>
> Is there any other way to do this? Please advise me.
>
> Thx.
>
> WITH WARM REGARDS HAVE A NICE DAY [ N.S.KARTHIK]
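The collector approach can be sketched in self-contained Java. The Collector class below is a local stand-in with the same callback shape as Lucene's HitCollector.collect(int doc, float score); it is defined here only so the example runs without Lucene, and RangeCollector is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a score-range collector. Collector is a local stand-in for the
// callback shape of Lucene's HitCollector; RangeCollector is a hypothetical name.
final class ScoreRangeSketch {

    /** Stand-in callback: invoked once per matching document with its score. */
    abstract static class Collector {
        abstract void collect(int doc, float score);
    }

    /** Remembers the numbers of documents whose score lies strictly between min and max. */
    static final class RangeCollector extends Collector {
        private final float min;
        private final float max;
        final List<Integer> docs = new ArrayList<>();

        RangeCollector(float min, float max) {
            this.min = min;
            this.max = max;
        }

        @Override
        void collect(int doc, float score) {
            if (score > min && score < max) {
                docs.add(doc);
            }
        }
    }

    public static void main(String[] args) {
        RangeCollector collector = new RangeCollector(0.5f, 0.8f);
        // Hypothetical (doc, score) pairs standing in for what a searcher would report.
        collector.collect(0, 0.95f);
        collector.collect(3, 0.62f);
        collector.collect(7, 0.41f);
        System.out.println(collector.docs); // prints [3]
    }
}
```

With Lucene itself, the same filtering logic would go into a HitCollector subclass passed to the searcher's search(query, collector) method. One caveat, as far as I understand it: the scores a collector receives are raw rather than normalized the way Hits scores are, so thresholds such as 0.5 and 0.8 may need adjusting.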
Re: Negative Boost
You can just use a boost that is < 1.0, no?

Otis

--- Terry Steichen <[EMAIL PROTECTED]> wrote:
> I can't get negative boosts to work with QueryParser. Is it possible
> to do so?
>
> TIA,
>
> Terry
Re: Negative Boost
On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser. Is it possible to do so?

Closer inspection on the parsing:

    <Boost> TOKEN : {
      <NUMBER: (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT
    }

where

    <#_NUM_CHAR: ["0"-"9"] >

So, no, negative boosts don't appear possible with QueryParser currently. I have no objections if you'd like to enhance the grammar to allow for it (provided sufficient unit tests, of course).

Erik
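The effect of that token definition can be checked with a rough Java regex equivalent: one or more digits, optionally followed by a dot and more digits, with no place for a sign. This is only an approximation of the grammar rule for illustration, not code from QueryParser, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Sketch: regex approximation of the boost number token,
// (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )?  with _NUM_CHAR = ["0"-"9"].
// Names are hypothetical; this is not QueryParser code.
final class BoostTokenSketch {

    private static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    /** Returns true if the text would be accepted as a boost value by the token rule. */
    static boolean isValidBoost(String text) {
        return NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"2", "0.1", "-1", ".5"}) {
            System.out.println(candidate + " -> " + isValidBoost(candidate));
        }
    }
}
```

Since the pattern has no leading minus sign, a value such as -1 never matches, which is consistent with negative boosts being rejected at parse time.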
Re: Negative Boost
On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser. Is it possible to do so?

More details please.

- What exact query expression did you use?
- Did you get an error? If so, what was it?
- What does Query.toString() output?

Erik
Re: Negative Boost
Terry Steichen writes:
> I can't get negative boosts to work with QueryParser. Is it possible to do so?

If you change QueryParser ;-)

Morus
Negative Boost
I can't get negative boosts to work with QueryParser. Is it possible to do so? TIA, Terry
Re: search exception in servlet!Please help me
My deepest apologies - I totally misspoke with my post yesterday. Chris, and the others, are correct - I wasn't thinking clearly and was confusing IndexReader.document() with Hits.doc().

So, as far as the exception goes - perhaps your servlet does not have access to the index because of permissions. Maybe you're using a different version of Lucene between the command-line and your web application?

Erik

On Aug 4, 2004, at 3:14 AM, Christiaan Fluit wrote:
> Erik Hatcher wrote:
> > Where did you get 'i'? Keep in mind that using Hits.doc(n) intends 'n' to be a document *id*, not the iteration through the Hits collection. This is a very common mistake, and I'm guessing one you've made here.
>
> I believe the Javadoc (as well as my own experience) tells otherwise:
>
> "public final Document doc(int n) throws IOException
> Returns the stored fields of the nth document in this set."
>
> Regards,
> Chris
Re: search exception in servlet!Please help me
Erik Hatcher wrote:
> Where did you get 'i'? Keep in mind that using Hits.doc(n) intends 'n' to be a document *id*, not the iteration through the Hits collection. This is a very common mistake, and I'm guessing one you've made here.

I believe the Javadoc (as well as my own experience) tells otherwise:

"public final Document doc(int n) throws IOException
Returns the stored fields of the nth document in this set."

Regards,
Chris