Re: Custom filter

2004-08-24 Thread roy-lucene-user
On Fri, 20 Aug 2004 20:01:36 -0400, Erik Hatcher wrote:
> On Aug 20, 2004, at 6:48 PM, [EMAIL PROTECTED] wrote:
> > We're currently in lucene 1.2... haven't moved to 1.3 yet.
>
> Skip 1.3 and go straight to 1.4.1 :)
>
> Upgrade - why not?

Well, we have some MASSIVE indexes, so upgrading needs to be planned out.  In the
meantime we continue with 1.2.  So, just for curiosity's sake... any clue on
the filter?  Or perhaps someone could clue me in on what kind of terms the
query parser creates (and what the searcher class does with them) when it
has something like (From:(blah OR blah2) OR To:(blah OR blah2)).  I tried to
look at the QueryParser.jj file but javacc makes my head hurt...
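
For what it's worth, here is a rough sketch of the shape such a query takes, in plain Java with no Lucene dependency. It assumes the usual parser behavior: each field:(a OR b) group becomes a boolean query with one optional term clause per value, and the term text goes through the analyzer handed to the parser before it is looked up. For matching purposes that is equivalent to a flat OR over every field/value pair. QueryShape, Clause, expand and render are hypothetical names for illustration, not Lucene classes, and lowercasing merely stands in for whatever the real analyzer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical model of the clause structure only; these are not Lucene classes.
final class QueryShape {

    /** One optional ("OR") term clause: a field name plus an analyzed term. */
    static final class Clause {
        final String field;
        final String term;

        Clause(String field, String term) {
            this.field = field;
            this.term = term;
        }

        @Override
        public String toString() {
            return field + ":" + term;
        }
    }

    /** Stand-in analyzer (assumption): lowercases the term text and nothing else. */
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    /** Expands every field/value pair into its own term clause. */
    static List<Clause> expand(String[] fields, String[] values) {
        List<Clause> clauses = new ArrayList<>();
        for (String field : fields) {
            for (String value : values) {
                clauses.add(new Clause(field, analyze(value)));
            }
        }
        return clauses;
    }

    /** Renders the flattened clauses as one OR-joined string. */
    static String render(List<Clause> clauses) {
        StringBuilder sb = new StringBuilder();
        for (Clause c : clauses) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Clause> clauses =
            expand(new String[] {"From", "To"}, new String[] {"Blah", "blah2"});
        System.out.println(render(clauses));
    }
}
```

The point of the model is that the parser's terms are analyzed before lookup, so they can differ from the raw strings typed into the query.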

Roy.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Custom filter

2004-08-20 Thread roy-lucene-user
Hi guys!

I was hoping someone here could help me out with a custom filter.

We have an index of emails and do some searches on the text of an email message and 
also searches based on the email addresses in a To, From or CC.

Since we also do searches on a bunch of emails, we created a custom filter for 
searches on an array of fields for an array of values.  [code included below]

The problem we're having is that creating a query string like so:
Message:viagra AND (From:(email1 OR email2) OR To:(email1 OR email2) OR CC:(email1 OR 
email2))
would return results, but our filter combined with a query string of Message:viagra 
sometimes wouldn't.

One thing I noticed is that when the results do return with the filter, the email has 
the format of [EMAIL PROTECTED], but the one that doesn't has something like 
[EMAIL PROTECTED]

Also it might have something to do with how the From, To or CC values are stored.  We 
don't parse out the email addresses before storing them.  So sometimes the value of a 
From/To/CC field might be [EMAIL PROTECTED] or local [EMAIL PROTECTED] or even 
[EMAIL PROTECTED].  Could the angle brackets be throwing off my filter?

I also wouldn't mind any suggestions for doing this filter better.

Here is the bits method from our custom filter:
-
// fields and values are String[] members of the filter class.
final public BitSet bits( IndexReader reader ) throws IOException {
    BitSet bits = new BitSet( reader.maxDoc() );

    for ( int x = 0; x < fields.length; x++ ) {
        for ( int y = 0; y < values.length; y++ ) {
            // Exact term lookup: values[y] is used as-is, with no analysis.
            TermDocs termDocs = reader.termDocs( new Term( fields[x], values[y] ) );
            try {
                while ( termDocs.next() ) {
                    bits.set( termDocs.doc() );
                }
            }
            finally {
                termDocs.close();
            }
        }
    }
    return bits;
}
-
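
One possible explanation for the mismatch, sketched in plain Java with no Lucene dependency: a term lookup like the one in bits() matches only when the exact string was indexed as a term. If the address fields went through an analyzer at index time (an assumption; nothing above says how they were indexed), the indexed terms are the analyzer's tokens, not the raw header value. ToyIndex, its methods and the sample addresses are hypothetical stand-ins, and the tokenizer here (lowercase, split on whitespace and angle brackets) is not any real Lucene analyzer.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy inverted index (hypothetical, not Lucene) keyed by "field:term".
final class ToyIndex {
    private final Map<String, BitSet> postings = new HashMap<>();

    /** Stand-in analyzer (assumption): lowercase, split on whitespace and < >. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[\\s<>]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Indexes one field of one document as analyzed tokens. */
    void add(int doc, String field, String value) {
        for (String token : analyze(value)) {
            postings.computeIfAbsent(field + ":" + token, k -> new BitSet()).set(doc);
        }
    }

    /** Mirrors the filter above: raw values are looked up as exact terms. */
    BitSet exactLookup(String[] fields, String[] values) {
        BitSet bits = new BitSet();
        for (String field : fields) {
            for (String value : values) {
                BitSet hit = postings.get(field + ":" + value);
                if (hit != null) {
                    bits.or(hit);
                }
            }
        }
        return bits;
    }

    /** Runs each value through the same analyzer before the lookup. */
    BitSet analyzedLookup(String[] fields, String[] values) {
        BitSet bits = new BitSet();
        for (String field : fields) {
            for (String value : values) {
                for (String token : analyze(value)) {
                    BitSet hit = postings.get(field + ":" + token);
                    if (hit != null) {
                        bits.or(hit);
                    }
                }
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(0, "From", "John Smith <John@Example.com>");
        index.add(1, "From", "jane@example.com");
        String[] fields = {"From"};
        String[] values = {"John@Example.com"};
        System.out.println("exact:    " + index.exactLookup(fields, values));
        System.out.println("analyzed: " + index.analyzedLookup(fields, values));
    }
}
```

In this toy, a value that already equals an indexed token is found either way, while a value that differs in case or carries extra decoration is found only when it is analyzed first. Whether that is what happens in the real index depends on how the From/To/CC fields were added.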

Thanks in advance,

Roy.




Re: Custom filter

2004-08-20 Thread Erik Hatcher
Have you considered using the built-in QueryFilter for this?   Why 
isn't it sufficient for your needs?

Erik


Re: Custom filter

2004-08-20 Thread Erik Hatcher
On Aug 20, 2004, at 6:48 PM, [EMAIL PROTECTED] wrote:
> We're currently in lucene 1.2... haven't moved to 1.3 yet.

Skip 1.3 and go straight to 1.4.1 :)

Upgrade - why not?

Erik