Re: Negative Boost

2004-08-04 Thread Morus Walter
Daniel Naber writes:
> On Wednesday 04 August 2004 13:19, Terry Steichen wrote:
> 
> > I can't get negative boosts to work with QueryParser.  Is it possible to do
> > so?
> 
> Isn't that the same as using a boost < 1, e.g. 0.1? That should be possible.
> 
No. Consider
a^-1 OR b
A boost of -1 means that the score gets smaller if a document contains a.
So it's somewhat similar to NOT a, though less strict.
A boost of 0.1 means that the score is increased less for an occurrence of a.

Usually one just wants the latter, but it's not the same.
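
To make the difference concrete, here is a toy sketch in Java. It is not Lucene's scoring formula: it assumes that every matching clause contributes a hypothetical raw score of 1.0 multiplied by its boost, and that the clause scores of an OR query are simply summed. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class BoostSketch {
    // Toy model (an assumption, not Lucene's formula): every clause that
    // matches the document adds rawScore * boost to the document's score.
    static float score(Map<String, Float> boostedTerms, Set<String> docTerms) {
        final float rawScore = 1.0f; // hypothetical per-clause score
        float total = 0.0f;
        for (Map.Entry<String, Float> clause : boostedTerms.entrySet()) {
            if (docTerms.contains(clause.getKey())) {
                total += rawScore * clause.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Float> negative = new LinkedHashMap<>();
        negative.put("a", -1.0f); // a^-1 OR b
        negative.put("b", 1.0f);

        Map<String, Float> small = new LinkedHashMap<>();
        small.put("a", 0.1f);     // a^0.1 OR b
        small.put("b", 1.0f);

        Set<String> onlyB = new HashSet<>(Arrays.asList("b"));
        Set<String> both = new HashSet<>(Arrays.asList("a", "b"));

        // With a negative boost, a document containing a scores lower than
        // one without it; with a small positive boost it still scores higher.
        System.out.println("a^-1:  b only=" + score(negative, onlyB)
                + ", a and b=" + score(negative, both));
        System.out.println("a^0.1: b only=" + score(small, onlyB)
                + ", a and b=" + score(small, both));
    }
}
```

In this model the negative boost pushes documents containing a below documents that contain only b, while the 0.1 boost still ranks them above.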

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Split an existing index into smaller segments without a re-index?

2004-08-04 Thread Doug Cutting
Kevin A. Burton wrote:
> Is it possible to take an existing index (say 1G) and break it up into a
> number of smaller indexes (say 10 100M indexes)...
>
> I don't think there's currently an API for this but it's certainly
> possible (I think).

Yes, it is theoretically possible but not yet implemented.
An easy way to implement it would be to subclass FilterIndexReader to 
return a subset of documents, then use IndexWriter.addIndexes() to write 
out each subset as a new index.  Subsets could be ranges of document 
numbers, and one could use TermPositions.skipTo() to accelerate the 
TermPositions subset implementation, but this still wouldn't be quite as 
fast as an index splitter that only reads each TermPositions once.  If 
we added a lower-level index writing API then one could use that to 
implement this...
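
The "ranges of document numbers" part of this idea can be sketched in plain Java. The sketch only computes the ranges; the FilterIndexReader subclass and the IndexWriter.addIndexes() call described above are not shown, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class DocRangeSplitter {
    // Splits the document numbers [0, maxDoc) into `parts` contiguous ranges.
    // Each int[] is {startInclusive, endExclusive}; range sizes differ by at
    // most one document.
    static List<int[]> split(int maxDoc, int parts) {
        if (maxDoc < 0 || parts <= 0) {
            throw new IllegalArgumentException("maxDoc must be >= 0 and parts > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = maxDoc / parts;
        int remainder = maxDoc % parts;
        int start = 0;
        for (int i = 0; i < parts; i++) {
            // The first `remainder` ranges take one extra document each.
            int size = base + (i < remainder ? 1 : 0);
            ranges.add(new int[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : split(1_000_000, 10)) {
            System.out.println("docs " + r[0] + " up to but excluding " + r[1]);
        }
    }
}
```

Note that equal document counts do not guarantee equal index sizes on disk, since documents vary in length.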

Doug



Split an existing index into smaller segments without a re-index?

2004-08-04 Thread Kevin A. Burton
Is it possible to take an existing index (say 1G) and break it up into a 
number of smaller indexes (say 10 100M indexes)...

I don't think there's currently an API for this but it's certainly 
possible (I think).

Kevin
--
Please reply using PGP.
   http://peerfear.org/pubkey.asc
   
   NewsMonster - http://www.newsmonster.org/
   
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
  AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
 IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster



RE: Question on the minimum value for DateField

2004-08-04 Thread wallen
The date is stored as a long that is the number of milliseconds since January 1, 1970.
Anything before that would be negative.
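
A sketch of why negative values are awkward here: index terms are compared as strings, so a date term needs an encoding whose string order matches chronological order. The Java below assumes a zero-padded base-36 encoding with a hypothetical fixed width of 9 characters. It illustrates the general technique and is not a copy of DateField's source; the class name and the width are assumptions.

```java
class DateKeySketch {
    static final int WIDTH = 9; // hypothetical fixed key width

    // Encodes a non-negative timestamp as a zero-padded base-36 string so
    // that string order matches numeric order. A negative value would start
    // with '-', which breaks that ordering, so it is rejected.
    static String encode(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("time before 1970 is not representable");
        }
        String digits = Long.toString(millis, Character.MAX_RADIX);
        if (digits.length() > WIDTH) {
            throw new IllegalArgumentException("time too large for key width");
        }
        StringBuilder key = new StringBuilder(WIDTH);
        for (int i = digits.length(); i < WIDTH; i++) {
            key.append('0'); // left-pad so all keys have the same length
        }
        return key.append(digits).toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(0L));
        System.out.println(encode(System.currentTimeMillis()));
    }
}
```

One common workaround for earlier dates is to index them in a different sortable form, such as a yyyyMMdd string.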

-Original Message-
From: Terence Lai [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 04, 2004 6:25 PM
To: Lucene Users List
Subject: Question on the minimum value for DateField


Hi All,

I realize that the DateField cannot accept a value which is before the
year 1970, specifically in the
org.apache.lucene.document.DateField.timeToString() method. Is there any
technical reason for this limitation?

Thanks,
Terence







Re: Question on number of fields in a document.

2004-08-04 Thread John Z
Thanks
I was looking at some older email on the list and found a message where Doug Cutting 
says that for fields that are not analyzed, we need not store the norms, nor load them 
into memory.
 
That change in the indexer will help a lot in this situation, where we might have 24 
fields indexed but not analyzed.
 
ZJ

Paul Elschot <[EMAIL PROTECTED]> wrote:
On Wednesday 04 August 2004 18:22, John Z wrote:
> Hi
>
> I have a question about the number of fields in a document. Is there any
> limit to the number of fields you can have in an index?
>
> We have around 25-30 fields per document at present: about 6 are keywords,
> around 6 are stored but not indexed, and the rest are text fields, which are
> analyzed and indexed. We are planning on adding around 24 more fields,
> mostly keywords.
>
> Does anyone see any issues with this? Any impact on search or indexing?

During search one byte of RAM is needed per searched field per document
for the normalisation factors, even if a document field is empty.
This RAM is occupied the first time a field is searched after opening
an index reader.
Supposing your queries would actually search 50 fields before
closing the index reader, the norms would occupy 50 bytes/doc, or
1 GB per 20 million docs.

Regards,
Paul


Question on the minimum value for DateField

2004-08-04 Thread Terence Lai
Hi All,

I realize that the DateField cannot accept a value which is before the year 1970, 
specifically in the org.apache.lucene.document.DateField.timeToString() method. Is 
there any technical reason for this limitation?

Thanks,
Terence







Re: Negative Boost

2004-08-04 Thread markharw00d
A solution to this has been proposed before - see 
http://wiki.apache.org/jakarta-lucene/CommunityContributions

Cheers
Mark




Re: Negative Boost

2004-08-04 Thread Doug Cutting
Terry Steichen wrote:
> But if, in the future, I or someone else took on this task of enhancing
> QueryParser, I'd like to be assured that the underlying Lucene engine will
> accept and support negative boosting.  Is that the case?

Lucene will multiply negative boosts into scores just like positive 
ones.  I've never been convinced that it makes much sense to use 
negative boosts in a scoring formula such as Lucene's, but there's 
nothing stopping you from using them.

Doug


Re: Question on number of fields in a document.

2004-08-04 Thread Paul Elschot
On Wednesday 04 August 2004 18:22, John Z wrote:
> Hi
>
> I have a question about the number of fields in a document. Is there any
> limit to the number of fields you can have in an index?
>
> We have around 25-30 fields per document at present: about 6 are keywords,
> around 6 are stored but not indexed, and the rest are text fields, which are
> analyzed and indexed. We are planning on adding around 24 more fields,
> mostly keywords.
>
> Does anyone see any issues with this? Any impact on search or indexing?

During search one byte of RAM is needed per searched field per document
for the normalisation factors, even if a document field is empty.
This RAM is occupied the first time a field is searched after opening
an index reader.
Supposing your queries would actually search 50 fields before
closing the index reader, the norms would occupy 50 bytes/doc, or
1 GB per 20 million docs.
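
The arithmetic above, spelled out as a small Java sketch. The class and method names are hypothetical; the only assumption taken from the text is one byte of norms per searched field per document.

```java
class NormsMemory {
    // One byte per searched field per document, as described above.
    static long normsBytes(int searchedFields, long numDocs) {
        return (long) searchedFields * numDocs;
    }

    public static void main(String[] args) {
        long bytes = normsBytes(50, 20_000_000L);
        System.out.println(bytes + " bytes for 50 searched fields and 20 million docs");
    }
}
```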

Regards,
Paul




RE: Question on number of fields in a document.

2004-08-04 Thread Aviran
You should be fine; there is no problem with the number of fields.

-Original Message-
From: John Z [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 04, 2004 12:23 PM
To: [EMAIL PROTECTED]
Subject: Question on number of fields in a document.


Hi
 
I have a question about the number of fields in a document. Is there any
limit to the number of fields you can have in an index?
 
We have around 25-30 fields per document at present: about 6 are keywords,
around 6 are stored but not indexed, and the rest are text fields, which are
analyzed and indexed. We are planning on adding around 24 more fields,
mostly keywords.
 
Does anyone see any issues with this? Any impact on search or indexing?
 
Thanks
ZJ







Re: Negative Boost

2004-08-04 Thread Terry Steichen
Well, I'm not too confident of my JavaCC skills, and when I've messed around with this 
stuff in the past, I sometimes ended up inadvertently creating problems in other areas 
of the query syntax. 

But if, in the future, I or someone else took on this task of enhancing QueryParser, 
I'd like to be assured that the underlying Lucene engine will accept and support 
negative boosting.  Is that the case?

Regards,

Terry

  - Original Message - 
  From: Erik Hatcher 
  To: Lucene Users List 
  Sent: Wednesday, August 04, 2004 9:12 AM
  Subject: Re: Negative Boost


  On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
  > I can't get negative boosts to work with QueryParser.  Is it possible 
  > to do so?

  Closer inspection on the parsing:

   <Boost> TOKEN : {
   <NUMBER:    (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT
   }

  where

 <#_NUM_CHAR:   ["0"-"9"] >

  So, no, negative boosts don't appear possible with QueryParser 
  currently.  I have no objections if you'd like to enhance the grammar 
  to allow for it (provided sufficient unit tests, of course).

  Erik





Re: Negative Boost

2004-08-04 Thread Terry Steichen
Near as I can tell, setting the boost to, say, 0.10, doesn't seem to do anything.

Regards,

Terry
  - Original Message - 
  From: Otis Gospodnetic 
  To: Lucene Users List 
  Sent: Wednesday, August 04, 2004 9:38 AM
  Subject: Re: Negative Boost


  You can just use boost that is < 1.0, no?

  Otis

  --- Terry Steichen <[EMAIL PROTECTED]> wrote:

  > I can't get negative boosts to work with QueryParser.  Is it possible
  > to do so?
  > 
  > TIA,
  > 
  > Terry
  > 
  > 
  > 




Re: Negative Boost

2004-08-04 Thread Daniel Naber
On Wednesday 04 August 2004 13:19, Terry Steichen wrote:

> I can't get negative boosts to work with QueryParser.  Is it possible to do
> so?

Isn't that the same as using a boost < 1, e.g. 0.1? That should be possible.

Regards
 Daniel





Question on number of fields in a document.

2004-08-04 Thread John Z
Hi
 
I have a question about the number of fields in a document. Is there any limit to the 
number of fields you can have in an index?
 
We have around 25-30 fields per document at present: about 6 are keywords, around 6 
are stored but not indexed, and the rest are text fields, which are analyzed and 
indexed. We are planning on adding around 24 more fields, mostly keywords.
 
Does anyone see any issues with this? Any impact on search or indexing?
 
Thanks
ZJ





Re: Hit & Score [ Between ]

2004-08-04 Thread Doug Cutting
You could instead use a HitCollector to gather only documents with 
scores in that range.
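
A minimal sketch of such a collector in Java. To keep it compilable on its own, the class below does not extend Lucene's HitCollector; it only mirrors the shape of the collect(int doc, float score) callback. In real use it would extend org.apache.lucene.search.HitCollector and be passed to the searcher's search(query, collector) method. The class name and the matchingDocs() accessor are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ScoreRangeCollector {
    private final float min;
    private final float max;
    private final List<Integer> docs = new ArrayList<>();

    ScoreRangeCollector(float min, float max) {
        this.min = min;
        this.max = max;
    }

    // Same shape as HitCollector.collect(int doc, float score): called once
    // per matching document; keeps only scores strictly inside (min, max).
    public void collect(int doc, float score) {
        if (score > min && score < max) {
            docs.add(doc);
        }
    }

    // Document numbers collected so far, in the order they were seen.
    List<Integer> matchingDocs() {
        return docs;
    }

    public static void main(String[] args) {
        ScoreRangeCollector collector = new ScoreRangeCollector(0.5f, 0.8f);
        // Hypothetical (doc, score) pairs standing in for a real search.
        collector.collect(0, 0.95f);
        collector.collect(1, 0.62f);
        collector.collect(2, 0.31f);
        System.out.println("docs in range: " + collector.matchingDocs());
    }
}
```

One caveat: the scores handed to a collector are raw scores, whereas Hits may normalize them, so a range chosen against Hits.score() might need adjusting.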

Doug

Karthik N S wrote:
> Hi
>
> Apologies
> If I want to get all the hits with scores between 0.5f and 0.8f,
> I usually use
>
> Query query = QueryParser.parse(srchkey, Fields, analyzer);
> Hits hits = searcher.search(query);
> int tothits = hits.length();
>
> for (int i = 0; i < tothits; i++) {
>     Document docs = hits.doc(i);
>     float score = hits.score(i);
>
>     if ((score > 0.5f) && (score < 0.8f)) {
>         System.out.println(" FileName  : " + docs.get("filename"));
>     }
> }
>
> Is there any other way to do this?
> Please advise me.
> Thx.
>
>   WITH WARM REGARDS
>   HAVE A NICE DAY
>   [ N.S.KARTHIK]




Re: Negative Boost

2004-08-04 Thread Otis Gospodnetic
You can just use boost that is < 1.0, no?

Otis

--- Terry Steichen <[EMAIL PROTECTED]> wrote:

> I can't get negative boosts to work with QueryParser.  Is it possible
> to do so?
> 
> TIA,
> 
> Terry
> 
> 
> 





Re: Negative Boost

2004-08-04 Thread Erik Hatcher
On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser.  Is it possible
> to do so?

Closer inspection on the parsing:

   <Boost> TOKEN : {
   <NUMBER:    (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT
   }

where

   <#_NUM_CHAR:   ["0"-"9"] >

So, no, negative boosts don't appear possible with QueryParser 
currently.  I have no objections if you'd like to enhance the grammar 
to allow for it (provided sufficient unit tests, of course).
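
The digits-with-optional-fraction pattern in the token definition above can be restated as a regular expression, which makes it easy to see that a leading minus sign never matches. This is a sketch in plain Java, not the JavaCC grammar itself; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class BoostNumberSketch {
    // Regex restatement of: (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )?
    // with <_NUM_CHAR> being a single digit 0-9.
    static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    // True if the whole text is a boost value the pattern would accept.
    static boolean isBoostNumber(String text) {
        return NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"2", "0.1", "-1", ".5"}) {
            System.out.println(candidate + " accepted: " + isBoostNumber(candidate));
        }
    }
}
```

Allowing negative boosts would amount to accepting an optional leading "-" in this pattern, which in the grammar means changing the token definition and regenerating the parser.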

Erik


Re: Negative Boost

2004-08-04 Thread Erik Hatcher
On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote:
> I can't get negative boosts to work with QueryParser.  Is it possible
> to do so?

More details please.

  - What exact query expression did you use?
  - Did you get an error?  If so, what was it?
  - What does Query.toString() output?

Erik


Re: Negative Boost

2004-08-04 Thread Morus Walter
Terry Steichen writes:
> I can't get negative boosts to work with QueryParser.  Is it possible to do so?
> 
If you change QueryParser ;-)

Morus




Negative Boost

2004-08-04 Thread Terry Steichen
I can't get negative boosts to work with QueryParser.  Is it possible to do so?

TIA,

Terry




Re: search exception in servlet!Please help me

2004-08-04 Thread Erik Hatcher
My deepest apologies - I totally misspoke with my post yesterday.  
Chris, and the others, are correct - I wasn't thinking clearly and was 
confusing IndexReader.document() with Hits.doc().

So, as far as the exception goes - perhaps your servlet does not have 
access to the index because of permissions.  Maybe you're using a 
different version of Lucene between the command-line and your web 
application?

Erik

On Aug 4, 2004, at 3:14 AM, Christiaan Fluit wrote:
> Erik Hatcher wrote:
>> Where did you get 'i'?   Keep in mind that using Hits.doc(n) intends
>> 'n' to be a document *id*, not the iteration through the Hits
>> collection.  This is a very common mistake, and I'm guessing one
>> you've made here.
>
> I believe the Javadoc (as well as my own experience) tells otherwise:
>
> "public final Document doc(int n) throws IOException
> Returns the stored fields of the nth document in this set."
>
> Regards,
> Chris


Re: search exception in servlet!Please help me

2004-08-04 Thread Christiaan Fluit
Erik Hatcher wrote:
> Where did you get 'i'?   Keep in mind that using Hits.doc(n) intends 'n'
> to be a document *id*, not the iteration through the Hits collection.
> This is a very common mistake, and I'm guessing one you've made here.

I believe the Javadoc (as well as my own experience) tells otherwise:

"public final Document doc(int n) throws IOException
Returns the stored fields of the nth document in this set."

Regards,
Chris