Re: QueryParser

2002-11-20 Thread Lee Mallabone
On Tue, 2002-11-19 at 23:07, stephane vaucher wrote: 
 I've tested the following. I don't know if I'm hitting expected 
 behaviour, but it seems suspicious:

Hi,

You might like to see this thread to lucene-dev from a couple of months
ago:
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg01843.html

I provided a test patch and what I thought was a fix, but as Doug
explains in his replies, setBoost() isn't currently implemented for 
boolean queries.

Regards,

-- 
Lee Mallabone.

 
  public void testPhraseBoost() throws Exception {
    assertQueryEquals("(a AND b) OR c", null, "(+a +b) c");
    assertQueryEquals("(a AND b)^2 c", null, "(+a +b)^2.0 c");
  }
 
 -- junit result
 
 There was 1 failure:
 1) 
 
testPhraseBoost(org.apache.lucene.queryParser.TestQueryParser)junit.framework.AssertionFailedError:
 
 Query /(a AND b)^2 c/ yielded /+a +b/, expecting /(+a +b)^2.0 c/
 at 
 org.apache.lucene.queryParser.TestQueryParser.assertQueryEquals(Unknown 



--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]




1

2002-11-20 Thread Tian LUO
  






RE: Several fields with the same name

2002-11-20 Thread Otis Gospodnetic
Maybe you can show the actual output of this piece of code.
What do you get?  Show...

--- Rob Outar [EMAIL PROTECTED] wrote:
 Otis,
 
   Tried this:
 
 f = doc.get(key);
 
 while (f != null ) {
 l.add(f);
 //get next value for same key
 f = doc.get(key);
 System.out.println(f);
 }
 
 I got an OutOfMemoryError after a while, so it looks like it keeps
 returning the same value and never null.
 
 Thanks,
 
 Rob
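 A note on the loop above: doc.get(key) keeps no cursor between calls, so each call returns the same value, f is never null, and the list grows until memory runs out. Below is a minimal sketch of the alternative, walking the stored fields once and collecting every value whose name matches. MiniDocument is a hypothetical stand-in written for this note, not the Lucene Document class; with Lucene itself the same walk would presumably go over the Enumeration returned by Document.fields(), which I believe exists in this version but have not checked against the 1.2 source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a document: an ordered list of name/value pairs. */
public class MiniDocument {
    private final List<String[]> fields = new ArrayList<>();

    public void add(String name, String value) {
        fields.add(new String[] {name, value});
    }

    /** Mimics the javadoc quoted in the thread: last added value, or null. */
    public String get(String name) {
        String result = null;
        for (String[] field : fields) {
            if (field[0].equals(name)) {
                result = field[1];
            }
        }
        return result;
    }

    /** Collects every value stored under the name, in insertion order. */
    public List<String> getAll(String name) {
        List<String> values = new ArrayList<>();
        for (String[] field : fields) {
            if (field[0].equals(name)) {
                values.add(field[1]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        MiniDocument doc = new MiniDocument();
        doc.add("name", "bob");
        doc.add("name", "jim");
        System.out.println(doc.get("name"));
        System.out.println(doc.getAll("name"));
    }
}
```

 The single pass terminates because it is bounded by the number of stored fields, not by a return value that never changes.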
 
 
 -Original Message-
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
 Sent: Wednesday, November 06, 2002 2:57 PM
 To: Lucene Users List
 Subject: Re: Several fields with the same name
 
 
 Looking at the source, it looks like you can just call it multiple
 times until it returns null.
 
 Otis
 
 --- Rob Outar [EMAIL PROTECTED] wrote:
  Hello all,
 
  I have a relationship where for one key there are many values,
  basically a
  1 to many relationship.  For example with the key = name, value =
  bob, jim,
  etc..
 
  When a client wants all the values that have been associated with
  the field
  name, how would I get that?  The javadoc for Document.get(String
  name)
  states:
 
  Returns the string value of the field with the given name if any
  exist in this document, or null.  If multiple fields may exist with
  this name, this method returns the last added such field.
 
  I don't need the last field's value, I need all values associated
  with that
  field.
 
  Any help would be appreciated.
 
  Thanks,
 
  Rob
 
 
 
 






Problem building Lucene

2002-11-20 Thread Nita Deshpande
Hi:

I downloaded the lucene source and have been trying to build using
ant. I am getting the following error message:
----------------------------------------------------------------------
Buildfile: build.xml

init:

javacc_check:

compile:
   [javacc] java was not found in
/usr/local/apps/java/bin/sparc/native_threads/java

BUILD FAILED

/users/science/user/lucene/lucene-1.2-src/build.xml:96: java failed
with return code 1
----------------------------------------------------------------------

The JavaCC version is 2.1. Platform is Sun sparc solaris. JAVA_HOME
env variable has been set to /usr/local/apps/java.
Any help will be most appreciated.

Thanks
ND





Book

2002-11-20 Thread William W

I would like to buy a book about Lucene.
Who could write it ? : )




A little date help

2002-11-20 Thread Rob Outar
Hello all,

I am indexing the date using the java.io.File.lastModified() method:

doc.add(new Field(MODIFIED_DT,
DateField.timeToString(f.lastModified()), true, true, true));

I am trying to search on this field, but I am having a hard time formatting
the date correctly.  I am not sure what date format lastModified() uses so
trying to come up with a query in milliseconds for the above date field is
difficult.

Has anyone run into this problem?  Is there an easier way to do this?

Let me know,

Rob
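
For what it is worth, java.io.File.lastModified() does not use a date format at all: it returns a long holding milliseconds since 00:00:00 GMT, January 1, 1970. So a query value can be built by converting the wanted date to milliseconds and encoding it the same way the field was indexed. The sketch below shows that conversion with only the JDK. The encode() method is an assumption standing in for DateField.timeToString(); I believe DateField produces a fixed-width base-36 string, but the width of 9 used here is my guess, so call DateField itself in real code.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

public class DateTerms {
    // Assumed width of the encoded term; not taken from the Lucene source.
    static final int WIDTH = 9;

    /** Converts an ISO date such as "2002-11-20" to epoch milliseconds at 00:00 UTC. */
    public static long toMillis(String isoDate) {
        return LocalDate.parse(isoDate)
                .atStartOfDay(ZoneOffset.UTC)
                .toInstant()
                .toEpochMilli();
    }

    /** Stand-in encoder: zero-padded base-36, so string order matches time order. */
    public static String encode(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("negative time: " + millis);
        }
        String digits = Long.toString(millis, Character.MAX_RADIX);
        if (digits.length() > WIDTH) {
            throw new IllegalArgumentException("time too large: " + millis);
        }
        StringBuilder sb = new StringBuilder(WIDTH);
        for (int i = digits.length(); i < WIDTH; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    /** Reverses encode(). */
    public static long decode(String term) {
        return Long.parseLong(term, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        long from = toMillis("2002-11-20");
        long to = toMillis("2002-11-21");
        System.out.println(from + " -> " + encode(from));
        System.out.println(to + " -> " + encode(to));
    }
}
```

Because the encoded strings have a fixed width, comparing them as text gives the same order as comparing the timestamps, which is what a range search over the field relies on.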






Re: Book

2002-11-20 Thread Ype Kingma
William,

On Wednesday 20 November 2002 21:14, you wrote:
 I would like to buy a book about Lucene.
 Who could write it ? : )
AFAIK there is no book, but some articles might help:

http://citeseer.nj.nec.com/cs?q=doug+cutting&submit=Search+Documents&cs=1

"Optimizations for Dynamic Inverted Index Maintenance" and
"An Object-Oriented Architecture for Text Retrieval" are the ones I like best.

Have fun,
Ype







RE: Stress/scalability testing Lucene

2002-11-20 Thread roy-lucene-user
Ah, for some reason I thought none of the Lucene methods were thread safe.
Or does that apply only to reading and writing at the same time?  I
thought I read this in the FAQ.

Roy.

-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, November 20, 2002 5:04 PM
To: Lucene Users List
Subject: Re: Stress/scalability testing Lucene



Justin Greene wrote:
 We created a thread pool to read and parse the email
 messages.  10 threads seems to be the magic number here for us.  We then
 created a queue of messages to be indexed onto which we push the parsed
 messages and have a single thread adding messages to the index.

IndexWriter.addDocument(Document) is thread safe, so you don't need a 
separate indexing thread.  So long as your analyzer is thread safe, you 
can index each message in the thread that parses it, for even greater 
parallelism.

Doug




This email and any attachments are confidential and may be 
legally privileged. No confidentiality or privilege is waived 
or lost by any transmission in error.  If you are not the 
intended recipient you are hereby notified that any use, 
printing, copying or disclosure is strictly prohibited.  
Please delete this email and any attachments, without 
printing, copying, forwarding or saving them and notify the 
sender immediately by reply e-mail.  Zurich Capital Markets 
and its affiliates reserve the right to monitor all e-mail 
communications through its networks.  Unless otherwise 
stated, any pricing information in this e-mail is indicative 
only, is subject to change and does not constitute an offer 
to enter into any transaction at such price and any terms in 
relation to any proposed transaction are indicative only and 
subject to express final confirmation.



RE: Stress/scalability testing Lucene

2002-11-20 Thread Otis Gospodnetic
Reading and writing at the same time is okay.  Only one thread can
modify the index at a time.

Otis





Re: Searching Ranges

2002-11-20 Thread Alex Winston
doug,
  if you happen to remember this thread, i was wanting to know if you
had any thoughts on improving this search in the situation below, my
temp fix does not work in all situations, so i am back to square one.

  i have completely gutted the RangeQuery and created an additional
RangeScorer to help eliminate some of the overhead incurred in the
special situation below, but the search times are still unacceptable. 
currently i have reduced the logic down to simply iterating over the set
of terms within the range, fetching the termDocs for each,
and maintaining an array of the results.  although my
implementation is substantially faster than before, it is still very
slow.  my thought was that i might be able to accomplish a more
efficient range query at the Reader level.  any thoughts?

  i am certain that some of the redundant iteration can be eliminated i
am just not sure how.

thanks
alex



 Alex Winston wrote:
  lets say that i have a document named d1, which contains a field named
  references.  within the references field i maintain a list of terms
  that represent my range from 001-005, more specifically the field would
  contain the terms 001 002 003 004 005.
 
  i would now like to search this range to determine if it falls within
  the range 003-010, so my query would look like references:[003 010].
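
The approach described above (walk the terms that fall inside the range, fetch the documents for each, accumulate the results) can be sketched with plain JDK collections. This is not Lucene's RangeQuery or the custom RangeScorer mentioned in the message; the sorted TreeMap is a hypothetical stand-in for the term dictionary and each BitSet stands in for the termDocs of one term. Zero-padded terms are assumed so that string order matches numeric order.

```java
import java.util.BitSet;
import java.util.TreeMap;

public class RangeScan {
    // term -> ids of the documents containing it, kept in sorted term order
    private final TreeMap<String, BitSet> postings = new TreeMap<>();

    /** Records that the document contains each of the given terms. */
    public void add(int docId, String... terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new BitSet()).set(docId);
        }
    }

    /** Unions the documents of every term in [lower, upper], both ends inclusive. */
    public BitSet docsInRange(String lower, String upper) {
        BitSet hits = new BitSet();
        for (BitSet docs : postings.subMap(lower, true, upper, true).values()) {
            hits.or(docs);
        }
        return hits;
    }

    public static void main(String[] args) {
        RangeScan index = new RangeScan();
        index.add(0, "001", "002", "003", "004", "005"); // d1 from the example
        index.add(1, "011", "012");
        System.out.println(index.docsInRange("003", "010"));
    }
}
```

Collecting into a bit set means a document that matches several terms in the range (d1 matches 003, 004 and 005 here) is recorded once, which avoids one source of repeated work when the ranges overlap heavily.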





Re: Stress/scalability testing Lucene

2002-11-20 Thread Doug Cutting
Otis Gospodnetic wrote:

Reading and writing at the same time is okay.  Only one thread can
modify the index at a time.


Almost.  Only one process can modify it at a time; other processes will 
be prevented by the write.lock file.  Multiple threads can modify an 
index simultaneously.  The bulk of the work of the updates will be 
serialized by synchronization in IndexWriter, but the analysis of the 
text into tokens is parallelizable.

Doug
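
The pattern described here can be sketched without Lucene: several worker threads each analyze their own message in parallel and then hand the result to one shared writer whose add method is synchronized. SharedWriter and analyze() are hypothetical stand-ins for IndexWriter and an analyzer, written only for this illustration; the pool size of 10 echoes the number mentioned earlier in the thread and is not a recommendation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelIndexing {
    /** Stand-in for a writer shared by all threads; additions are serialized. */
    static class SharedWriter {
        private final List<List<String>> docs = new ArrayList<>();

        synchronized void addDocument(List<String> tokens) {
            docs.add(tokens);
        }

        synchronized int size() {
            return docs.size();
        }
    }

    /** Stand-in analysis step; stateless, so safe to run in many threads at once. */
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    /** Analyzes each message in a pool thread and adds it to one shared writer. */
    public static int indexAll(List<String> messages, int threads) {
        SharedWriter writer = new SharedWriter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String message : messages) {
            // analysis runs in parallel; only addDocument is serialized
            pool.submit(() -> writer.addDocument(analyze(message)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return writer.size();
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("Hello Lucene", "Thread safe indexing");
        System.out.println(indexAll(messages, 10));
    }
}
```

No separate queue or dedicated indexing thread is needed in this arrangement, since the lock inside the writer already orders the additions.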




Help on creating and maintaining an index that changes

2002-11-20 Thread Joe Consumer
Hi,

I'm a lucene newbie.  I wanted to ask someone's expert
opinion on how to attack this issue.  I have a set of
documents (a catalog) that many clients want to
register with the search server.  While those clients
are reachable their catalog should be available, but
if they log off or disappear then I want to remove
their catalog from the index.

Currently, I have this implemented with two hashmaps. 
Their catalog is assigned a unique key in one hashmap,
and their catalog contents are parsed out into
keywords and put into the master hashmap, which
indexes into the other one.  When a client leaves I
remove their catalog from the first hashmap, and I
don't clean up the references in the master hashmap. 
If a search indexes a key that is null in the first
hashmap I remove the reference at that point from the
master Hashmap.
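
For readers following along, the two-hashmap scheme just described can be sketched as below. All names (CatalogIndex, register, unregister, search) are made up for this illustration, and splitting on whitespace is an assumed stand-in for whatever keyword parsing is really used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CatalogIndex {
    // first hashmap: unique client key -> catalog contents
    private final Map<String, String> catalogs = new HashMap<>();
    // master hashmap: keyword -> keys of the clients whose catalog contains it
    private final Map<String, Set<String>> master = new HashMap<>();

    public void register(String clientKey, String catalogText) {
        catalogs.put(clientKey, catalogText);
        for (String keyword : catalogText.toLowerCase().split("\\s+")) {
            if (!keyword.isEmpty()) {
                master.computeIfAbsent(keyword, k -> new HashSet<>()).add(clientKey);
            }
        }
    }

    /** Drops the catalog only; stale references stay in the master map. */
    public void unregister(String clientKey) {
        catalogs.remove(clientKey);
    }

    /** Returns live client keys for the keyword, pruning stale references on the way. */
    public List<String> search(String keyword) {
        List<String> hits = new ArrayList<>();
        Set<String> keys = master.get(keyword.toLowerCase());
        if (keys == null) {
            return hits;
        }
        for (Iterator<String> it = keys.iterator(); it.hasNext();) {
            String key = it.next();
            if (catalogs.containsKey(key)) {
                hits.add(key);
            } else {
                it.remove(); // lazy cleanup of a client that has left
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        CatalogIndex index = new CatalogIndex();
        index.register("c1", "red widget");
        index.register("c2", "blue widget");
        index.unregister("c1");
        System.out.println(index.search("widget"));
    }
}
```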

I want to do something similar with Lucene, but I
don't know how to approach it.  I thought maybe
keeping the first hashmap as is, and building a
Directory in Lucene that replaces the master hashmap.
When I get hits back from Lucene I look them up in
the first hashmap, and return those.

How do I put the needed information into the Directory so
I can look the hits up in the first hashmap?  I would need
the unique id identifying the client, and a key that
identifies the document that the client has.

Then how do I clean up the Directory when a client is
not available?  How do I remove a document from
Lucene's Directory?

thanks in advance,
charlie






Re: Searching for keyword 0 [zero] using TermQuery

2002-11-20 Thread Otis Gospodnetic
Is 0 in the list of your stop words?

Otis

--- Eric Fixler [EMAIL PROTECTED] wrote:
 Hello.  I have a field in an index that stores item id's which can be
 
 zero.  I use a TermQuery to search for these, and everything works
 fine 
 except when I'm searching for things with id 0; these entries return
 no 
 results.
 
 The index appears to have the correct data and the query looks proper
 
 as far as I can tell.
 
 Is this a known issue?  Can anyone suggest a possible workaround?
 
 thanks
 eric
 
 




Re: Book

2002-11-20 Thread Otis Gospodnetic
I wrote a few articles that I'm trying to publish somewhere now.
Cheaper than a book :)

Otis

--- William W [EMAIL PROTECTED] wrote:
 
 I would like to buy a book about Lucene.
 Who could write it ? : )






Fun project?

2002-11-20 Thread Robert A. Decker
I wish I had time to work on this for fun, but I was thinking about what
could be a fun lucene project...

One could build a peer-to-peer document search application. Each client
would index the documents on its harddrive, or documents in a particular
directory. When the user at the computer does a search it will look at the
documents on its harddrive, but also send out a request for the search on
the P2P network.

First though, are there any P2P java frameworks out there? One could build
one, perhaps with OpenJMS, but it would be nice if one already existed.

Hmm... if anyone else thinks this would be cool I'd be willing to work on
this with you. 


thanks,
Robert A. Decker

http://www.robdecker.com/
http://www.planetside.com/







Re: Searching for keyword 0 [zero] using TermQuery

2002-11-20 Thread Eric Fixler
Hi.  Thanks for the reply.

I'm just using StandardAnalyzer with the no-args constructor; 0 does 
not appear to be one of the STOP_WORDS.

eric



On Wednesday, November 20, 2002, at 08:56 PM, Otis Gospodnetic wrote:

Is 0 in the list of your stop words?

Otis

--- Eric Fixler [EMAIL PROTECTED] wrote:

Hello.  I have a field in an index that stores item id's which can be

zero.  I use a TermQuery to search for these, and everything works
fine
except when I'm searching for things with id 0; these entries return
no
results.







Re: Searching for keyword 0 [zero] using TermQuery : PROBLEM SOLVED

2002-11-20 Thread Eric Fixler
My bug: the check for non-empty form fields was a little bit 
over-aggressive.

As always, thanks for the help (and thanks for lucene!)

eric


