Re: Pool of IndexReaders or Pool of Searchers?

2004-07-11 Thread Anson Lau
Hi,

When I load-tested a Lucene-powered search app, using a pool of
IndexSearchers did not give me any more searches per second than
using a single shared (singleton) IndexSearcher.

Anson
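
The singleton arrangement described above can be sketched roughly as follows. This is only an illustration, not the code from the load test: SketchSearcher is a hypothetical stand-in interface (not Lucene's IndexSearcher), and the "hit count" it returns is a placeholder. It assumes the shared object is safe to call from several threads at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a thread-safe searcher; this is not Lucene's IndexSearcher.
interface SketchSearcher {
    int search(String query);
}

final class SharedSearcher {
    // The single instance every thread uses. It keeps no per-call state,
    // so this sketch needs no locking around search().
    private static final SketchSearcher INSTANCE = query -> query.length();

    private SharedSearcher() {}

    static SketchSearcher get() {
        return INSTANCE;
    }

    // Starts `threads` workers that each run `perThread` searches on the shared
    // instance, and returns the sum of the placeholder hit counts.
    static long runLoad(int threads, int perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    long hits = 0;
                    for (int i = 0; i < perThread; i++) {
                        hits += get().search("lucene");
                    }
                    return hits;
                }));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A pooled variant would hand each worker its own instance instead of calling get(); the observation above is that doing so did not raise throughput.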


Quoting [EMAIL PROTECTED]:

 Hi,
 
 I have multiple threads reading an index.  Should they all be using
 the same IndexReader and using a pool of IndexSearchers?  Or should
 they be using a pool of IndexReaders?
 
 Basically, one reader or many?
 
 Thanks.
 

-
 To unsubscribe, e-mail:
 [EMAIL PROTECTED]
 For additional commands, e-mail:
 [EMAIL PROTECTED]
 




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Field.java - STORED, NOT_STORED, etc...

2004-07-11 Thread Kevin A. Burton
I've been working with the Field class doing index conversions from
an old index format to my new external content store proposal (thus the
email about the 14M convert).

Anyway... I find the whole Field.Keyword, Field.Text thing confusing.
The main problem is that the Field constructor just takes booleans,
and if you forget the order of the booleans it's very confusing.

new Field( name, value, true, false, true );
Looking at that, you have NO idea what it's doing without fetching the javadoc.
So I added a few constants to my class:
new Field( name, value, NOT_STORED, INDEXED, NOT_TOKENIZED );
which IMO is a lot easier to maintain.
Why not add these constants to Field.java:
   public static final boolean STORED = true;
   public static final boolean NOT_STORED = false;
   public static final boolean INDEXED = true;
   public static final boolean NOT_INDEXED = false;
   public static final boolean TOKENIZED = true;
   public static final boolean NOT_TOKENIZED = false;
Of course you still have to remember the order but this becomes a lot 
easier to maintain.

Kevin
--
Please reply using PGP.
   http://peerfear.org/pubkey.asc
   
   NewsMonster - http://www.newsmonster.org/
   
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
  AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
 IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster



How to get all the values of a field in all the index

2004-07-11 Thread clibois


Hello,
I have indexed a database which contains a field named category.
I would like to get an enumeration of all the categories in the index.
To do that I need to get all the distinct values of this field in
the index. For the moment I use the terms() method, check for each term
whether it is a category term, and if so get the value of that term.
However, this is rather inefficient. Do you have a better solution?
Claude Libois
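
One way to avoid scanning every term is to jump straight to the first term of the field and stop at the first term of any other field, which works when the term dictionary is sorted by field name and then by term text. In Lucene the comparable call is IndexReader.terms(Term), which, as far as I know, returns a TermEnum positioned at the first term at or after the one given. The sketch below does not use Lucene at all: SketchTerm and FieldValues are hypothetical names, and a TreeSet stands in for the sorted term dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a (field, text) term, ordered by field and then by text.
final class SketchTerm implements Comparable<SketchTerm> {
    final String field;
    final String text;

    SketchTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    @Override
    public int compareTo(SketchTerm other) {
        int byField = field.compareTo(other.field);
        return byField != 0 ? byField : text.compareTo(other.text);
    }
}

final class FieldValues {
    private FieldValues() {}

    // Seeks to the first term of `field` and collects term texts until the
    // enumeration moves on to a different field.
    static List<String> valuesOf(TreeSet<SketchTerm> dictionary, String field) {
        List<String> values = new ArrayList<>();
        // (field, "") sorts at or before every term of that field.
        for (SketchTerm term : dictionary.tailSet(new SketchTerm(field, ""))) {
            if (!term.field.equals(field)) {
                break; // past the last term of this field
            }
            values.add(term.text);
        }
        return values;
    }
}
```

The loop touches only the terms of the requested field plus one extra term, instead of every term in the index.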




Is it possible to delete a term?

2004-07-11 Thread clibois
Hello,
I am still working on my categorization tools and I am implementing a
dimensionality reduction. I would like to reduce my index to a subset of its
terms, so I was wondering whether it is possible to delete a term rather than
a document. Or is there another way to reduce my number of terms?
What I need exactly is a TermFreqVector that contains only terms
that I have defined beforehand. Is it possible to do that with a filtered query?
If so, could you give me a code example? I have difficulty seeing how.
Thanks in advance
Claude Libois
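
If the goal is only a term vector restricted to a predefined vocabulary, one option is to leave the index alone and filter the vector in memory after reading it. The sketch below assumes the terms and frequencies arrive as two parallel arrays, in the way TermFreqVector exposes them through getTerms() and getTermFrequencies(). VocabularyFilter is a hypothetical helper, and nothing here deletes terms from an index.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class VocabularyFilter {
    private VocabularyFilter() {}

    // Keeps only the (term, frequency) pairs whose term is in `allowed`.
    // `terms` and `freqs` are parallel arrays: freqs[i] belongs to terms[i].
    static Map<String, Integer> restrict(String[] terms, int[] freqs, Set<String> allowed) {
        if (terms.length != freqs.length) {
            throw new IllegalArgumentException("terms and freqs must have the same length");
        }
        // LinkedHashMap preserves the original term order of the vector.
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (int i = 0; i < terms.length; i++) {
            if (allowed.contains(terms[i])) {
                kept.put(terms[i], freqs[i]);
            }
        }
        return kept;
    }
}
```

The reduced map can then be fed to the categorizer as the lower-dimensional representation of the document.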





Re: Why is Field.java final?

2004-07-11 Thread Doug Cutting
Kevin A. Burton wrote:
 I was going to create a new IDField class which just calls super( name,
 value, false, true, false) but noticed I was prevented because
 Field.java is final?

You don't need to subclass to do this, just a static method somewhere.

 Why is this?  I can't see any harm in making it non-final...

Field and Document are not designed to be extensible.  They are
persisted in such a way that added methods are not available when the
field is restored.  In other words, when a field is read, it always
constructs an instance of Field, not a subclass.

Doug
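
The "static method somewhere" suggestion could look roughly like the sketch below. PlainField is a hypothetical stand-in for a final class with a five-argument constructor (name, value, and three booleans), and FieldHelpers.id() is an invented helper name; the flag values are the ones from the quoted super(...) call.

```java
// Hypothetical stand-in for a final field class; it cannot be subclassed.
final class PlainField {
    final String name;
    final String value;
    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    PlainField(String name, String value, boolean stored, boolean indexed, boolean tokenized) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
        this.tokenized = tokenized;
    }
}

// The helper owns the flag combination, so callers never repeat the booleans.
final class FieldHelpers {
    private FieldHelpers() {}

    // false, true, false: the same arguments as the quoted super(...) call.
    static PlainField id(String name, String value) {
        return new PlainField(name, value, false, true, false);
    }
}
```

Because the helper returns an ordinary PlainField, nothing is lost when the field is later read back as the base class.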


Re: Field.java - STORED, NOT_STORED, etc...

2004-07-11 Thread Doug Cutting
Kevin A. Burton wrote:
 So I added a few constants to my class:
 new Field( name, value, NOT_STORED, INDEXED, NOT_TOKENIZED );
 which IMO is a lot easier to maintain.
 Why not add these constants to Field.java:
    public static final boolean STORED = true;
    public static final boolean NOT_STORED = false;
    public static final boolean INDEXED = true;
    public static final boolean NOT_INDEXED = false;
    public static final boolean TOKENIZED = true;
    public static final boolean NOT_TOKENIZED = false;
 Of course you still have to remember the order but this becomes a lot
 easier to maintain.

It would be best to get the compiler to check the order.
If we change this, why not use type-safe enumerations:
http://www.javapractices.com/Topic1.cjp
The calls would look like:
new Field(name, value, Stored.YES, Indexed.NO, Tokenized.YES);
Stored could be implemented as the nested class:
public final class Stored {
  private Stored() {}
  public static final Stored YES = new Stored();
  public static final Stored NO = new Stored();
}
and the compiler would check the order of arguments.
How's that?
Doug
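
To make the compile-time check concrete, here is a small self-contained sketch of the three-type variant. TypedField is a hypothetical stand-in rather than the real Field class, and the nested classes are declared static so that they can hold their own constants.

```java
// Hypothetical stand-in for a field class that takes one distinct type per flag.
final class TypedField {
    static final class Stored {
        private Stored() {}
        static final Stored YES = new Stored();
        static final Stored NO = new Stored();
    }

    static final class Indexed {
        private Indexed() {}
        static final Indexed YES = new Indexed();
        static final Indexed NO = new Indexed();
    }

    static final class Tokenized {
        private Tokenized() {}
        static final Tokenized YES = new Tokenized();
        static final Tokenized NO = new Tokenized();
    }

    final String name;
    final String value;
    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    TypedField(String name, String value, Stored stored, Indexed indexed, Tokenized tokenized) {
        this.name = name;
        this.value = value;
        // Each constant is a singleton, so identity comparison is enough.
        this.stored = (stored == Stored.YES);
        this.indexed = (indexed == Indexed.YES);
        this.tokenized = (tokenized == Tokenized.YES);
    }
}

// Swapping two arguments is now a compile error rather than a silent bug:
//   new TypedField("id", "42", TypedField.Indexed.YES,
//                  TypedField.Stored.NO, TypedField.Tokenized.NO);
```

With plain boolean constants the swapped call would still compile, because every argument has the same type.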



Re: Field.java - STORED, NOT_STORED, etc...

2004-07-11 Thread Doug Cutting
Doug Cutting wrote:
 The calls would look like:
 new Field(name, value, Stored.YES, Indexed.NO, Tokenized.YES);
 Stored could be implemented as the nested class:
 public final class Stored {
   private Stored() {}
   public static final Stored YES = new Stored();
   public static final Stored NO = new Stored();
 }

Actually, while we're at it, Indexed and Tokenized are confounded.  A 
single entry would be better, something like:

public final class Index {
  private Index() {}
  public static final Index NO = new Index();
  public static final Index TOKENIZED = new Index();
  public static final Index UN_TOKENIZED = new Index();
}
then calls would look like just:
new Field(name, value, Store.YES, Index.TOKENIZED);
BTW, I think Stored would be better named Store too.
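
One reading of "confounded" is that tokenizing only makes sense for an indexed field, so two independent booleans allow a combination (not indexed, but tokenized) that means nothing. The sketch below shows how a single Index argument could map back onto the two old flags. MergedField is a hypothetical stand-in, not the real Field class, and the mapping is my assumption about the intent.

```java
// Hypothetical stand-in for a field class with the merged Store/Index arguments.
final class MergedField {
    static final class Store {
        private Store() {}
        static final Store YES = new Store();
        static final Store NO = new Store();
    }

    static final class Index {
        private Index() {}
        static final Index NO = new Index();
        static final Index TOKENIZED = new Index();
        static final Index UN_TOKENIZED = new Index();
    }

    final String name;
    final String value;
    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    MergedField(String name, String value, Store store, Index index) {
        if (store == null || index == null) {
            throw new NullPointerException("store and index must not be null");
        }
        this.name = name;
        this.value = value;
        this.stored = (store == Store.YES);
        // Any value other than NO means the field is indexed.
        this.indexed = (index != Index.NO);
        // Only TOKENIZED turns tokenizing on, so "tokenized but not indexed"
        // cannot be expressed at all.
        this.tokenized = (index == Index.TOKENIZED);
    }
}
```

Three constants cover the three meaningful combinations, and the fourth simply has no spelling.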
BooleanQuery's required and prohibited flags could get the same 
treatment, with the addition of a nested class like:

public final class Occur {
  private Occur() {}
  public static final Occur MUST_NOT = new Occur();
  public static final Occur SHOULD = new Occur();
  public static final Occur MUST = new Occur();
}
and adding a boolean clause would look like:
booleanQuery.add(new TermQuery(...), Occur.MUST);
Then we can deprecate the old methods.
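
As a rough illustration of how the new overload could sit beside the old flag-based one, the sketch below maps each Occur constant onto a (required, prohibited) pair and delegates. SketchBooleanQuery is a hypothetical stand-in that stores clauses as strings; it is not BooleanQuery, and the +/- rendering is only there to make the result visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a boolean query; clauses are plain strings here.
final class SketchBooleanQuery {
    static final class Occur {
        final boolean required;
        final boolean prohibited;

        private Occur(boolean required, boolean prohibited) {
            this.required = required;
            this.prohibited = prohibited;
        }

        static final Occur MUST = new Occur(true, false);
        static final Occur SHOULD = new Occur(false, false);
        static final Occur MUST_NOT = new Occur(false, true);
    }

    private final List<String> clauses = new ArrayList<>();

    // Old style: two booleans whose order the caller has to remember.
    void add(String clause, boolean required, boolean prohibited) {
        if (required && prohibited) {
            throw new IllegalArgumentException("a clause cannot be both required and prohibited");
        }
        String prefix = required ? "+" : (prohibited ? "-" : "");
        clauses.add(prefix + clause);
    }

    // New style: one self-describing argument, delegating to the old method.
    void add(String clause, Occur occur) {
        add(clause, occur.required, occur.prohibited);
    }

    @Override
    public String toString() {
        return String.join(" ", clauses);
    }
}
```

Since the three constants are the only Occur instances, the invalid "required and prohibited" pair can no longer be requested through the new method.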
Comments?
Doug


Re: Field.java - STORED, NOT_STORED, etc...

2004-07-11 Thread Tatu Saloranta
On Sunday 11 July 2004 10:03, Doug Cutting wrote:
 Doug Cutting wrote:
  The calls would look like:
 
  new Field(name, value, Stored.YES, Indexed.NO, Tokenized.YES);
 
.
 Actually, while we're at it, Indexed and Tokenized are confounded.  A
 single entry would be better, something like:
...
 then calls would look like just:

 new Field(name, value, Store.YES, Index.TOKENIZED);
...
 and adding a boolean clause would look like:

 booleanQuery.add(new TermQuery(...), Occur.MUST);

 Then we can deprecate the old methods.

 Comments?

I was about to suggest this instead of int/boolean constants, since it is
recommended practice and gives better type safety (at least until JDK 1.5's
real enums). I would prefer this over non-typesafe constants, although
even just defining and using simple constants would be an improvement
over the existing situation.

Another possibility (or maybe complementary approach) would be to just 
completely do away with constructor access; make the constructors private or 
protected, and only allow factory methods to be used externally. This would 
have the benefit of even better readability: minimum number of arguments 
(method name would replace one or two args) and full type checking. Plus it'd 
be easier to modify implementations should that become necessary. Factory 
methods are especially useful for classes like Field, that are not designed 
to be sub-classed.
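
A minimal sketch of the factory-method idea follows. FactoryField is a hypothetical stand-in, and the method names are modeled on the Field.Keyword and Field.Text helpers mentioned earlier in the thread. The flag combination behind each name is an assumption made for illustration.

```java
// Hypothetical stand-in for a field class that can only be built through factories.
final class FactoryField {
    final String name;
    final String value;
    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    // Private: callers cannot pass raw booleans, so they cannot misorder them.
    private FactoryField(String name, String value,
                         boolean stored, boolean indexed, boolean tokenized) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
        this.tokenized = tokenized;
    }

    // Stored and indexed as a single untokenized term.
    static FactoryField keyword(String name, String value) {
        return new FactoryField(name, value, true, true, false);
    }

    // Stored, indexed, and tokenized.
    static FactoryField text(String name, String value) {
        return new FactoryField(name, value, true, true, true);
    }

    // Indexed and tokenized, but not stored.
    static FactoryField unStored(String name, String value) {
        return new FactoryField(name, value, false, true, true);
    }

    // Stored only; not searchable.
    static FactoryField unIndexed(String name, String value) {
        return new FactoryField(name, value, true, false, false);
    }
}
```

A call such as FactoryField.keyword("id", "42") takes only the name and value, and the method name carries what the booleans used to.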

-+ Tatu +-

