Hi,
We are about to start a new project that involves maintaining a
read/write Lucene index with about 2 million documents. I would
really appreciate if people already maintaining such volumes could post
their configurations: CPU, RAM, Operating System, any comments you
consider useful.
Doesn't that one do just that - treats fields differently, based on
their name?
Yes, it does, but look at the question's title:
"How do I write my own Analyzer?"
if someone has a problem with a non-tokenized field (which was the
problem of the mail thread that started this) then he doesn't know
I think you will have to go the pseudo null/blank placeholder route.
--
Ian.
[EMAIL PROTECTED]
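
A minimal sketch of the placeholder idea, for illustration only. It does not use Lucene; a plain map stands in for an exact-match (untokenized) field, and the class name, method names, and the `__blank__` token are all hypothetical. The point is only that the same substitution has to be applied when indexing and when building the query term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BlankPlaceholderSketch {
    // Hypothetical placeholder token; choose one that cannot occur as a real value.
    static final String BLANK = "__blank__";

    // Toy exact-match index for a single field: term -> ids of documents holding it.
    private final Map<String, List<Integer>> index = new HashMap<>();

    // Map null or empty values to the placeholder, leave everything else alone.
    static String normalize(String value) {
        return (value == null || value.isEmpty()) ? BLANK : value;
    }

    // Index time: store the normalized value as the term.
    void add(int docId, String fieldValue) {
        index.computeIfAbsent(normalize(fieldValue), k -> new ArrayList<>()).add(docId);
    }

    // Query time: normalize the search value the same way before the lookup.
    List<Integer> search(String fieldValue) {
        return index.getOrDefault(normalize(fieldValue), new ArrayList<>());
    }

    public static void main(String[] args) {
        BlankPlaceholderSketch idx = new BlankPlaceholderSketch();
        idx.add(1, "abc");
        idx.add(2, "");
        idx.add(3, null);
        System.out.println(idx.search("")); // prints [2, 3]
    }
}
```

With a real index, the equivalent would be to store the placeholder string in the field when the value is blank and to search for that same string.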
[EMAIL PROTECTED] (aaz) wrote
Hi,
We have a document with 2 Fields.
a) title = X
b) fieldX =
How can I do a search to only get documents where fieldX = . When I construct a
TermQuery
Uma
If you know servlets and JSP you should be able to figure
out how to integrate Lucene. Presumably you have already read
the Getting Started guide?
Suggestions:
1. Create a Lucene index of whatever it is you want to
search across. As a standalone program. No servlets
or JSP or
List,
Bear with me as I recount my learning process. My questions follow:
I have deployed the demo successfully, with some edits for results display and the
like (really superficial in nature overall). One thing I did do was display the query
back to the user in the result page. In doing so, I
Hello all,
I am storing the field in this fashion:
doc.add(new Field(releaseability, releaseability, true, true,
false));
so it is indexed and stored but not tokenized.
The value is Test Releaseability;
I am using the query releaseability:test
Try searching for:
+releaseability:Test +releaseability:Releaseability
Otis
--- Rob Outar [EMAIL PROTECTED] wrote:
Hello all,
I am storing the field in this fashion:
doc.add(new Field(releaseability, releaseability, true,
true,
false));
so it is indexed
Heh, funny :)
Look at the jGuru Lucene FAQ for building a custom Analyzer.
Your Analyzer has to treat some of your fields differently.
Otis
--- Spence Nichols [EMAIL PROTECTED] wrote:
Hi
I have created an index which has documents with many fields.
E.g. Field.Keyword("AUTHOR", "Fred Bloggs") -
On Thursday 14 November 2002 19:36, you wrote:
Hello all,
I am storing the field in this fashion:
doc.add(new Field(releaseability, releaseability, true, true,
false));
so it is indexed and stored but not tokenized.
The value is Test Releaseability;
Ironically, I had to solve this exact problem just 10 minutes ago...
Check into javax.swing.text.html.HTMLEditorKit and
javax.swing.text.html.HTMLDocument. Here's a URL that I found helpful (the site
is Japanese, but the source code is still Java):
Well, let me know if you figure it out and I will do the same. I don't
quite understand how those classes would help out. Would you somehow use
them to create the Reader object that is passed to create the TokenStream
object?
-Original Message-
From: Craig Walls
how much help can you afford? :]
-Original Message-
From: Uma Maheswar [mailto:uma;globalleafs.com]
Sent: Wednesday, November 13, 2002 9:09 PM
To: Lucene Users List
Subject: Can any one help me?
Hello,
I am disappointed at not getting any reply even after 4
posts. Is there
Thanks! Yeah, that's exactly what I need to do. I can really just leave
Lucene out of the picture, and just feed it the content text that I have
parsed out of the html document. Thanks again for your help!
-Original Message-
From: Craig Walls [mailto:wallsc;michaels.com]
Sent: Thursday,
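
For readers following this thread, here is a rough sketch of pulling the text out of an HTML document with the JDK's own javax.swing.text.html classes, so that only the plain text is handed to the indexer. It is one possible variant, not the code from the thread: it uses HTMLEditorKit.ParserCallback with ParserDelegator rather than HTMLDocument, and the class and method names (HtmlTextExtractor, extractText) are made up for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class HtmlTextExtractor {

    // Parse the HTML from the reader and return only its text content.
    public static String extractText(Reader html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The parser calls handleText once for each run of text between tags.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append(' '); // keep separate text runs apart
                }
                text.append(data);
            }
        };

        // The third argument (true) tells the parser to ignore charset directives.
        new ParserDelegator().parse(html, callback, true);
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>";
        System.out.println(extractText(new StringReader(html)));
    }
}
```

The returned string (or a StringReader wrapped around it) can then be used as the field content when building the document, which keeps the HTML parsing entirely outside the analyzer.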
Have a look at the HtmlDocument class in the ant contributions
directory of jakarta-lucene-sandbox in Jakarta's CVS:
http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/ant/src/main/org/apache/lucene/ant/HtmlDocument.java?annotate=1.1
I wrote this and it uses JTidy to