Hardware requirements for millions of documents

2002-11-14 Thread Ilya Khandamirov
Hi, We are about to start a new project that involves maintaining a read/write Lucene index with about 2 million documents. I would really appreciate it if people already maintaining such volumes could post their configurations: CPU, RAM, operating system, and any comments you consider useful.

Re: Problems with exact matches on non-tokenized fields...

2002-11-14 Thread Stefanos Karasavvidis
Doesn't that one do just that - treats fields differently, based on their name? Yes, it does, but look at the question's title: "How do I write my own Analyzer?" If someone has a problem with a non-tokenized field (which was the problem in the mail thread that started this), then he doesn't know

Re: searching on for null/blank field val

2002-11-14 Thread Ian Lea
I think you will have to go the pseudo null/blank placeholder route. -- Ian. [EMAIL PROTECTED] [EMAIL PROTECTED] (aaz) wrote: Hi, We have a document with 2 Fields. a) title = X b) fieldX = "" How can I do a search to only get documents where fieldX = ""? When I construct a TermQuery
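A minimal sketch of the placeholder idea mentioned above, in plain Java with no Lucene calls. The class name, the method name, and the `__blank__` token are all hypothetical; the assumption is that a null or blank value is swapped for a fixed marker before the field is indexed, so that an ordinary term search for the marker can find those documents later.

```java
public class BlankPlaceholderSketch {
    // Hypothetical marker; any token that cannot occur in real data would do.
    public static final String BLANK_TOKEN = "__blank__";

    /** Returns the value to index: the marker for null/blank input, else the input itself. */
    public static String indexedValue(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return BLANK_TOKEN;
        }
        return raw;
    }

    public static void main(String[] args) {
        // A blank fieldX is indexed as the marker, so a search for the marker finds it.
        System.out.println(indexedValue(""));
        System.out.println(indexedValue("X"));
    }
}
```

The same marker has to be used on the query side, and it should be indexed without tokenization so that it survives as a single term.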

Re: Can any one help me?

2002-11-14 Thread Ian Lea
Uma, if you know servlets and JSP you should be able to figure out how to integrate Lucene. Presumably you have already read the Getting Started guide? Suggestions: 1. Create a Lucene index of whatever it is you want to search across. As a standalone program. No servlets or JSP or

Security question with demo... feedback sought.

2002-11-14 Thread Stone, Timothy
List, Bear with me as I recount my learning process. My questions follow: I have deployed the demo successfully, with some edits for results display and the like (really superficial in nature overall). One thing I did do was display the query back to the user in the result page. In doing so, I

Not getting any results from query

2002-11-14 Thread Rob Outar
Hello all, I am storing the field in this fashion: doc.add(new Field(releaseability, releaseability, true, true, false)); so it is indexed and stored but not tokenized. The value is Test Releaseability; I am using the query releaseability:test
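One possible reading of the situation described above, shown as a toy model rather than with Lucene itself: if a field is indexed without tokenization, the whole value is stored as a single term, so a query for one lowercased word has nothing to match. The class and method names below are hypothetical, and the tokenizer (lowercase, split on non-word characters) is only an assumption standing in for whatever analyzer is actually in use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class UntokenizedFieldSketch {
    // Maps "field:term" to the ids of the documents containing that term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Indexes the whole value as one term, unchanged (like an untokenized field). */
    public void addUntokenized(int docId, String field, String value) {
        addTerm(docId, field, value);
    }

    /** Indexes each lowercased word as its own term (a stand-in for an analyzer). */
    public void addTokenized(int docId, String field, String value) {
        for (String token : value.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                addTerm(docId, field, token);
            }
        }
    }

    /** Exact term lookup; no analysis is applied to the query term. */
    public Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.emptySet());
    }

    private void addTerm(int docId, String field, String term) {
        postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
    }

    public static void main(String[] args) {
        UntokenizedFieldSketch index = new UntokenizedFieldSketch();
        index.addUntokenized(1, "releaseability", "Test Releaseability");
        index.addTokenized(2, "releaseability", "Test Releaseability");
        // Only the tokenized document contains the term "test".
        System.out.println(index.search("releaseability", "test"));
        // Only the untokenized document contains the full value as a term.
        System.out.println(index.search("releaseability", "Test Releaseability"));
    }
}
```

In this model the untokenized document is reachable only through the exact, case-sensitive full value, which is consistent with the query `releaseability:test` returning nothing.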

Re: Not getting any results from query

2002-11-14 Thread Otis Gospodnetic
Try searching for: +releaseability:Test +releaseability:Releaseability Otis --- Rob Outar [EMAIL PROTECTED] wrote: Hello all, I am storing the field in this fashion: doc.add(new Field(releaseability, releaseability, true, true, false)); so it is indexed

Re: How do I stop the QueryParser from tokenising fields?

2002-11-14 Thread Otis Gospodnetic
Heh, funny :) Look at the jGuru Lucene FAQ for building a custom Analyzer. Your Analyzer has to treat some of your fields differently. Otis --- Spence Nichols [EMAIL PROTECTED] wrote: Hi I have created an index which has documents with many fields. Eg Field.Keyword(AUTHOR, Fred Bloggs) -

Re: Not getting any results from query

2002-11-14 Thread Ype Kingma
On Thursday 14 November 2002 19:36, you wrote: Hello all, I am storing the field in this fashion: doc.add(new Field(releaseability, releaseability, true, true, false)); so it is indexed and stored but not tokenized. The value is Test Releaseability;

Re: HTML Analyzer?

2002-11-14 Thread Craig Walls
Ironically, I had to solve this exact problem just 10 minutes ago... Check into javax.swing.text.html.HTMLEditorKit and javax.swing.text.html.HTMLDocument. Here's a URL that I found helpful (the site is Japanese, but the source code is still Java):

RE: HTML Analyzer?

2002-11-14 Thread Lichty, Kent
Well, let me know if you figure it out and I will do the same. I don't quite understand how those classes would help out. Would you somehow use them to create the Reader object that is passed to create the TokenStream object? -Original Message- From: Craig Walls

RE: Can any one help me?

2002-11-14 Thread Sale, Doug
how much help can you afford? :] -Original Message- From: Uma Maheswar [mailto:uma;globalleafs.com] Sent: Wednesday, November 13, 2002 9:09 PM To: Lucene Users List Subject: Can any one help me? Hello, I am disappointed for not getting any reply even after 4 posts. Is there

RE: HTML Analyzer?

2002-11-14 Thread Lichty, Kent
Thanks! Yeah, that's exactly what I need to do. I can really just leave Lucene out of the picture, and just feed it the content text that I have parsed out of the HTML document. Thanks again for your help! -Original Message- From: Craig Walls [mailto:wallsc;michaels.com] Sent: Thursday,
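A rough sketch of the approach discussed in this thread: parse the HTML outside Lucene with the JDK's Swing HTML parser and hand only the extracted text to the indexer. It uses `HTMLEditorKit.ParserCallback` and `ParserDelegator` from the JDK; the class and method names are hypothetical, and this is not necessarily the code the posters ended up with.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class HtmlTextSketch {
    /** Returns the text content of an HTML string with runs of whitespace collapsed. */
    public static String extractText(String html) {
        final StringBuilder text = new StringBuilder();
        // The callback receives each run of character data the parser finds.
        HTMLEditorKit.ParserCallback collector = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data).append(' ');
            }
        };
        try {
            // The third argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString().replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>world</b></p></body></html>";
        System.out.println(extractText(html));
    }
}
```

The returned string could then be added to a document as an ordinary tokenized text field, which keeps the analyzer unaware of HTML altogether.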

Re: HTML Analyzer?

2002-11-14 Thread Erik Hatcher
Have a look at the HtmlDocument class in the Ant contributions directory of jakarta-lucene-sandbox in Jakarta's CVS: http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/ant/src/main/org/apache/lucene/ant/HtmlDocument.java?annotate=1.1 I wrote this and it uses JTidy to