I cannot remember the answer I got, but I asked the same question after
the code was changed to put locks in java.io.tmpdir.
Because I have an application that deals with a lot of indices
simultaneously, I felt like this would make things more difficult in
cases where you have stale locks, etc.
Try
I do not know enough about the German stemmer included with Lucene, but
I can suggest that you look at the Snowball stemmers. Take a look at
the Lucene Sandbox (link on Lucene's home page) to see how they can be
used with Lucene.
Otis
--- Marius Seiceanu [EMAIL PROTECTED] wrote:
Hello!
as they are currently.
Otis
--- Erik Hatcher [EMAIL PROTECTED] wrote:
It seems to be the issue mentioned here as well:
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18410
On Wednesday, October 8, 2003, at 09:41 PM, Otis Gospodnetic wrote:
Answer to question comment: possibly
Uh, this message was flagged nicely in my Lucene folder, but I just got
to it now. The link should show up in the 'Articles, etc.' section on
Lucene's pages next time they are refreshed.
Otis
--- Jeff Linwood [EMAIL PROTECTED] wrote:
Hi,
I wrote a short introductory article about Lucene on
Thank you, I added this to our 'patch queue'.
If you have a more recent version, feel free to attach it to the
enhancement report that you received in email.
Otis
--- Anthony Eden [EMAIL PROTECTED] wrote:
Version 1.0 of the DBDirectory library, which implements a Directory
which can store
Please note that the class that caused the error,
org.apache.lucene.search.IndexOrderSearcher, is not really a Lucene
class. You got that class from http://sf.net/projects/weblucene, most
likely.
Otis
--- lhelper [EMAIL PROTECTED] wrote:
Hi.
I get a strange problem with my web application
The CVS version of Lucene has a patch that allows one to use a
'Compound Index' instead of the traditional one. This reduces the
number of open files. For more info, see/make the Javadocs for
IndexWriter.
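The idea behind the Compound Index can be sketched as follows. This is a toy illustration only, not Lucene's actual .cfs file format: several per-segment files are packed into one blob with a directory of (offset, length) entries, so the OS needs only one file handle instead of one per sub-file. All names here are invented.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration of the compound-index idea (NOT Lucene's real .cfs
// format): many small per-segment files are packed into one container
// with a table of (offset, length) entries, so a single open file
// replaces one OS file handle per sub-file.
public class CompoundFileSketch {
    private final ByteArrayOutputStream blob = new ByteArrayOutputStream();
    private final Map<String, int[]> directory = new LinkedHashMap<>(); // name -> {offset, length}

    // Pack a named sub-file into the single container.
    public void addEntry(String name, byte[] data) {
        directory.put(name, new int[] { blob.size(), data.length });
        blob.write(data, 0, data.length);
    }

    // Read a sub-file back out of the container by name.
    public byte[] readEntry(String name) {
        int[] loc = directory.get(name);
        return Arrays.copyOfRange(blob.toByteArray(), loc[0], loc[0] + loc[1]);
    }

    public int entryCount() { return directory.size(); }
}
```

However many entries are packed, the container is still just one file from the operating system's point of view, which is the whole point of the feature.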
Otis
--- Tate Avery [EMAIL PROTECTED] wrote:
You might have trouble with too many open
A very rough and simple 'add a single document to the index' test shows
that the Compound Index is marginally slower than the traditional one.
I did not test searching.
Otis
--- Eric Jain [EMAIL PROTECTED] wrote:
The CVS version of Lucene has a patch that allows one to use a
'Compound Index'
Hm, beats me.
The code in question seems to be:
public RAMInputStream(RAMFile f) {
file = f;
length = file.length;
}
...which is called from:
/** Returns a stream reading an existing file. */
public final InputStream openFile(String name) {
RAMFile file =
Then we agree, and it is StopFilter that needs to be patched to take
into account the number of removed terms, and add appropriate
positional info to each term.
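The fix described above can be sketched like this. It is a minimal stand-in, not Lucene's StopFilter: when a stop word is dropped, the next surviving token carries a larger "position increment" so phrase positions stay accurate. Class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of a stop filter that accounts for removed terms: each
// surviving token is tagged with a position increment of 1 plus the
// number of stop words skipped immediately before it.
public class StopFilterSketch {
    // Returns surviving tokens as "term:positionIncrement" strings.
    public static List<String> filter(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        int skipped = 0;
        for (String t : tokens) {
            if (stopWords.contains(t)) {
                skipped++;          // remember the hole we left behind
            } else {
                out.add(t + ":" + (1 + skipped));
                skipped = 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a"));
        System.out.println(filter(Arrays.asList("the", "quick", "fox"), stop));
        // → [quick:2, fox:1]
    }
}
```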
Otis
--- Erik Hatcher [EMAIL PROTECTED] wrote:
On Tuesday, October 21, 2003, at 07:31 PM, Otis Gospodnetic wrote:
So phone boy
, at 18:06 Europe/Amsterdam, Otis Gospodnetic wrote:
Since 'files' is a Hashtable, neither the key nor the value (file)
can
be null, even though the NPE in RAMInputStream constructor implies
that
file was null.
Yep... pretty weird... but looking at openFile(String name)... could
--- William W [EMAIL PROTECTED] wrote:
Hi Erik,
Why don't you write a book about Lucene ? : )
Maybe he already is writing it. :)
Regarding your original question, I don't think anyone will be able to
answer it, as it is quite general. I suggest you describe pieces of
your code that concern
I've seen people just add an additional BooleanClause and join it with
the original query using an AND.
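The effect of that suggestion can be sketched at the query-string level. In real Lucene one would add a required BooleanClause to a BooleanQuery instead; this just shows what the combined query looks like, with invented field names.

```java
// Sketch of AND-joining an extra constraining clause onto the user's
// original query, shown at the query-string level for clarity.
public class AndFilterSketch {
    public static String constrain(String userQuery, String field, String value) {
        // Wrap the original query so the AND binds to all of it.
        return "(" + userQuery + ") AND " + field + ":" + value;
    }

    public static void main(String[] args) {
        System.out.println(constrain("title:lucene OR body:lucene", "lang", "en"));
        // → (title:lucene OR body:lucene) AND lang:en
    }
}
```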
Otis
--- Stephan Melchior [EMAIL PROTECTED] wrote:
Hi,
I'm new with Lucene and need help,
My Problem:
I successfully performed a query via
hits = searcher.search(query);
Now i want to
Hello,
Erik Hatcher and I are in the process of writing a book about Lucene.
Among other things, we would like to include 'Lucene Patterns' /
'Lucene Best Practices' type of material in the book.
If you feel that you have observed and/or implemented Lucene usage
patterns that look like they
Apparently so :(
http://www.google.com/search?q=lucene+%22term+out+of+order%22
Otis
--- Victor Hadianto [EMAIL PROTECTED] wrote:
Hi all,
I'm using Lucene.Net but seems appropriate to post here as well. I
have been
getting this exception Term out of order every now and then while
doing a
Look at the Benchmarks page on Lucene's site.
It is not complete (heh, it can never be complete), but it will give
you some ideas about Lucene's performance.
Feel free to submit your benchmarks, using this template:
http://jakarta.apache.org/lucene/docs/benchmarktemplate.xml
Thank you,
Otis
.) Is this a
problem
or not?
Regards,
Terry Steichen
- Original Message -
From: Otis Gospodnetic [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Wednesday, October 29, 2003 7:09 PM
Subject: Re: Term out of order.
Apparently so :(
http://www.google.com/search?q
I believe a person just sent an email with a solution yesterday or the
day before. Look for a message with MultiFieldQueryParser in its
Subject.
Otis
--- Maurice Coyle [EMAIL PROTECTED] wrote:
are there any plans to implement some sort of
MultiFieldQueryParser.setOperator(int) method so folk
Wow, with 16GB RAM, I would definitely load the index into RAM. You
can use RAMDirectory(Directory) constructor for that.
As for RAM drives: I have no experience with those, but I have heard
of some people using ramfs under Linux. Ramfs is a memory-based
filesystem. Mount it and you have
--- jt oob [EMAIL PROTECTED] wrote:
Thank you for the replies!
My indexes are currently looking like they might be 12GB when
finished
on the current run.
I have spotted a tool on the lucene site for listing the most
frequently occurring words in the index. Currently I am using the
Use a single instance of IndexSearcher.
When you detect that the index has changed, through that instance (see
javadoc for the exact method name, I don't recall its exact name now),
discard that instance, and make a new one.
Do this check before every query or every X unit of time if you don't
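The single-instance pattern described above can be sketched as follows. It is a stand-in, not real Lucene code: a `LongSupplier` plays the role of the IndexReader method that reports the index version, and a plain `Object` plays the role of `IndexSearcher`.

```java
import java.util.function.LongSupplier;

// Sketch of "reuse one IndexSearcher; reopen only when the index
// changed". The version supplier and Object searcher are stand-ins.
public class SearcherHolder {
    private final LongSupplier indexVersion; // stand-in for the index-version check
    private long openedAtVersion;
    private Object searcher;                 // stand-in for IndexSearcher

    public SearcherHolder(LongSupplier indexVersion) {
        this.indexVersion = indexVersion;
        reopen();
    }

    private void reopen() {
        openedAtVersion = indexVersion.getAsLong();
        searcher = new Object(); // real code: close the old searcher, open a new one
    }

    // Call before every query: reuse the searcher unless the index changed.
    public Object getSearcher() {
        if (indexVersion.getAsLong() != openedAtVersion) {
            reopen();
        }
        return searcher;
    }
}
```

The same holder can be checked on a timer instead of per-query if the version check itself is too expensive to do every time.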
--- Erik Hatcher [EMAIL PROTECTED] wrote:
On Thursday, November 6, 2003, at 02:44 PM, Chong, Herb wrote:
it's the line with the close(). so the remedy then is to make sure
that it is called only once. what is the recommended way to process
two folders worth of documents then? do i
Excellent.
If you have time, please contribute a patch for the terse and vague
documentation, so others don't have to suffer.
Thanks,
Otis
--- Chong, Herb [EMAIL PROTECTED] wrote:
i'm running in a single thread. the demo app is pretty vague on
things and expects me to read the detailed
--- Leo Galambos [EMAIL PROTECTED] wrote:
Marcel Stör wrote:
Hi
As everybody seems to be so excited about it, would someone please be
so kind as to explain
what document-based clustering is?
AFAIK, document clustering consists of detection of documents with
similar content (similar
Thanks for the clarification, Stefan. I should have known that... :)
Otis
--- Stefan Groschupf [EMAIL PROTECTED] wrote:
Hi,
How is document clustering different/related to text categorization?
Clustering: try to find your own categories and put documents that
match into them.
You group all
1). If I delete a term using an IndexReader, can I use an existing
IndexWriter to write to the index? Or do I need to close and reopen
the IndexWriter?
No. You should close IndexWriter first, then open IndexReader, then
call delete, then close IndexReader, and then open a new IndexWriter.
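The ordering above can be illustrated with a toy model of the lock involved: deleting via an IndexReader needs the same write.lock that an open IndexWriter holds, so the two cannot both be open for modification at once. All names here are invented for illustration.

```java
// Toy model of write.lock exclusivity: at most one party may hold the
// lock, which is why the writer must be closed before the reader
// deletes, and vice versa.
public class WriteLockSketch {
    private String heldBy = null;

    public void acquire(String who) {
        if (heldBy != null) {
            throw new IllegalStateException("write.lock held by " + heldBy);
        }
        heldBy = who;
    }

    public void release(String who) {
        if (who.equals(heldBy)) {
            heldBy = null;
        }
    }

    public static void main(String[] args) {
        WriteLockSketch lock = new WriteLockSketch();
        lock.acquire("IndexWriter");
        lock.release("IndexWriter");   // close the writer first...
        lock.acquire("IndexReader");   // ...then the reader may delete
        lock.release("IndexReader");   // close the reader...
        lock.acquire("IndexWriter");   // ...and a new writer can continue
    }
}
```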
Ernesto, it looks like something got stripped. A ZIP file should make
it to the list. If not, maybe you can post it somewhere.
Could you also tell us a bit about this code? Is it better than
existing PDF/Word parsing solutions? Pure Java? Uses POI?
Thanks,
Otis
--- Ernesto De Santis
Correct. write.lock is used for that.
Otis
--- Morus Walter [EMAIL PROTECTED] wrote:
Otis Gospodnetic writes:
No, it is not safe. You should close the IndexWriter, then delete
the
document and close IndexReader, and then get a new IndexWriter and
continue writing.
IIRC lucene
Lucene does not implement vector space model.
Otis
--- [EMAIL PROTECTED] wrote:
Hi,
does Lucene implement a Vector Space Model? If yes, does anybody have
an
example of how using it?
Cheers,
Ralf
No, sorry.
Otis
--- Ralf Bierig [EMAIL PROTECTED] wrote:
Does Lucene implement Latent Semantic Indexing? Examples?
Ralf
an IndexWriter to delete an item?
On Tue, Nov 11, 2003 at 02:46:37PM -0800, Otis Gospodnetic wrote:
1). If I delete a term using an IndexReader, can I use an
existing
IndexWriter to write to the index? Or do I need to close and
reopen
the IndexWriter?
No. You should close
I am not using RAMDirectory due to the large size of the
index file. The index generated on hard disk is 1.57G
for 1 million documents; each document has an average of 500
terms. I am using Field.UnStored(fieldName, terms), so
I believe I am not storing the documents, just the
index. (Is that right?)
Multiple threads against the same index or multiple indices - no
advantage - think about the mechanical parts involved (disk head).
Multiple threads against indices on different disks (not just
partitions!) - yes, that would be faster.
Reading the index from the disk is the bottleneck, not the
Erik is referring to the VERY latest version - the CVS :)
Otis
--- Erik Hatcher [EMAIL PROTECTED] wrote:
On Thursday, November 13, 2003, at 06:46 PM, Tomcat Programmer
wrote:
Hopefully the dev group will consider refactoring the
code so that when its doing the lexing it will throw
Dmitry once contributed a nice beefy patch that added Term Vector
support to Lucene. While we never integrated the changes (for no good
reason), I do recall that the patch was nice and elegant, because it
allowed one to turn Term Vector support on/off at indexing time.
If turned on, Lucene would
Sorry, no firm date. However, 1.3 RC2 is pretty solid, so I suggest
you just use that until 1.3 final is out.
Otis
--- Tun Lin [EMAIL PROTECTED] wrote:
Hi,
Anyone knows when the full version of Lucene version 1.3 will be
released?
Please advise.
Thanks.
It sounds like you missed this:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/DefaultSimilarity.html
You can write your own implementations and use it during indexing and
searching.
Otis
--- Ralf B [EMAIL PROTECTED] wrote:
Hi,
I am a very beginner of Lucene and
1.3RC2.
Otis
--- Scott Smith [EMAIL PROTECTED] wrote:
If you had to be production in January, would you be using 1.3RC2 or
1.2?
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 4:03 AM
To: Lucene Users List; [EMAIL PROTECTED
You have to look at Analyzers.
Figure out which one you are using and why, and see if you should be
using a different one or even write your own.
Some of the Analyzers break input on certain tokens (e.g. . or _ or
...), which sounds like the problem here.
I think Erik's java.net article about
Because '_' was probably removed from your input before it was indexed.
I suggest reading up on Analyzers and Tokenizers.
Otis
--- Pleasant, Tracy [EMAIL PROTECTED] wrote:
Searching 'red_*' returns nothing, also.
-Original Message-
From: Dror Matalon [mailto:[EMAIL
Yes.
For this particular example, PorterStemFilter will do the job.
For more complex things (e.g. a search for car returning car, auto,
automobile, vehicle) you'll need to add thesaurus-like capability to
your indexer. This can be done by writing a custom Analyzer.
It sounds like you have a lot
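The thesaurus idea above can be sketched as follows. This is a stand-in for what a custom analyzer would do at indexing time, not real Lucene code, and the synonym table is invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of thesaurus-like expansion: a custom analyzer could emit
// synonyms alongside each original token, so a search for "auto" also
// finds documents that only contained "car".
public class SynonymSketch {
    public static List<String> expand(List<String> tokens, Map<String, List<String>> synonyms) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t); // always keep the original token
            out.addAll(synonyms.getOrDefault(t, List.of())); // inject any synonyms
        }
        return out;
    }
}
```

In a real analyzer the injected synonyms would also get a position increment of 0 so they sit at the same position as the original token for phrase queries.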
There are several highlighting solutions for Lucene out there.
I know of two that include source code, and I think at least one of
them is on the Contributions page.
There are some threads that talk about this issue, too.
Otis
--- Pleasant, Tracy [EMAIL PROTECTED] wrote:
I have seen that
Correct.
As for side-effect, well, things will be slower, obviously :)
Increase the limit, perform a search, and see if it's still
sufficiently fast...that's what I would do. :)
Otis
--- Dror Matalon [EMAIL PROTECTED] wrote:
This was raised in
http://www.mail-archive.com/[EMAIL
Hm, I don't know of any such tools. It would be nice to have something
like that. If you find such a tool, or write it yourself, let us know
about the URL, so we can include it on Lucene's site.
Otis
--- Dror Matalon [EMAIL PROTECTED] wrote:
Hi,
I looked around the archives and didn't see
Maybe this will help?
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=23545
Otis
--- Tun Lin [EMAIL PROTECTED] wrote:
Hi,
May I know how do I analyse Chinese input from Chinese text in
Lucene?
Do I use Analyser function in Lucene? If yes, how to go about using
it?
--- Dror Matalon [EMAIL PROTECTED] wrote:
So, the lock is set, the segments file is opened, all the files in
the
segments file are opened and then the lock is released? Is that
correct?
Yes.
See IndexReader.
And we're relying on the OS to keep the file handles around even if
the
files
Could you add a Lucene logo somewhere on your search results, as noted
here:
http://jakarta.apache.org/lucene/docs/powered.html ?
Thanks!
Otis
--- Ulrich Mayring [EMAIL PROTECTED] wrote:
Hello,
we (DENIC) are the world's second largest domain registry (.de-zone
has
almost 6.9 million
Ok, let us know if you can add it.
Otis
--- Ulrich Mayring [EMAIL PROTECTED] wrote:
Otis Gospodnetic wrote:
Could you add a Lucene logo somewhere on your search results, as
noted
here:
http://jakarta.apache.org/lucene/docs/powered.html ?
Will suggest that to the powers
Uh, I get to do this dirty job. :(
Lucene-user and lucene-dev are not the appropriate fora for questions
such as this one.
Please ask the original author of the text for help, or use an online
translation service, such as the one at http://babelfish.av.com
Also, for questions about Lucene usage,
You should ask Spindle author(s). The error doesn't look like
something that is related to Lucene, really.
Otis
--- Zhou, Oliver [EMAIL PROTECTED] wrote:
What about Spindle? Has anybody used it to crawl a JSP-based web
site? Do I
need to install listlib.jar to do so?
I got error message
Maybe
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#maxFieldLength
?
Otis
--- Aaron Galea [EMAIL PROTECTED] wrote:
Hi
I am indexing a document but for a strange reason the word Mayo is
never indexed. The thing is that in this large document this term
I am against making the suggested Lucene modification.
Lucene index structure may change in the future. It is possible that
one day Lucene developers will need to use a hierarchy of directories
to implement some feature.
Therefore, Lucene users should be discouraged from creating
sub-directories
Stefan, which patch are you referring to?
I looked at the following, but did not find it:
I think this never resulted in a patch. A few days after that thread
another person expressed interest in implementing the same thing, but I
am not sure what the status of that idea is now.
Otis
--- Stefan Groschupf [EMAIL PROTECTED] wrote:
Otis,
based on this discussion:
--- Stefan Groschupf [EMAIL PROTECTED] wrote:
Just to be sure, since there was a lot of discussion in the lists:
there is actually no solution available to get a term vector for a
document or a TF/IDF feature vector for a document, is there?
Correct :(
Has someone worked on such things?
Nice.
Please send the cvs diff, as I mentioned in that thread where you sent
inlined diffs.
Thanks,
Otis
--- Damian Gajda [EMAIL PROTECTED] wrote:
BTW, I may send you the partly working Lucene with Dmitry's code
patched
in.
--
Damian
I don't think there was a follow-up to this.
Aaron, please provide a listing of the directory that you are using in
IndexWriter constructor. Is it empty? What are permissions on it?
When the exception occurs, a file called write.lock should remain in
the directory. Can you ls -al that file?
I don't fully understand what you mean by increasing the maximum string
size. Are you referring to the length of terms in the field, so now
your field can contain terms whose text/string value can have the
size/length of 10,000 bytes?
If that is so, I believe there is an internal (to Lucene)
Please try. I find this (the originally described problem) hard to
believe. :)
Otis
--- Chong, Herb [EMAIL PROTECTED] wrote:
i would like to, but the documents contain confidential content. i
don't know if i can reproduce the problem with another set of
documents.
Herb
Tun Lin,
WebLucene is a different project, so you should really use its mailing
lists. I doubt subscribers to the Lucene mailing list will be able to
help you as much as the WebLucene author and other WebLucene users.
Otis
--- Tun Lin [EMAIL PROTECTED] wrote:
Hi,
When I downloaded the web
You can use raw *Query classes and OR, perhaps.
Or, if you are using QueryParser, there is a MultiFieldQueryParser (or
something like that) class, which I've used a while ago.
Otis
--- Thijs Cadier [EMAIL PROTECTED] wrote:
I'm implementing Lucene in our Content Management system. A plugin
for
Lucene writes locks to some directories (the java.io.tmpdir system property),
so make sure you can write to those.
Otis
--- Alex Gadea [EMAIL PROTECTED] wrote:
I am trying to setup a Lucene installation on a Windows 2000 server.
I can not get the IndexWriter to initialize properly. It fails out
You could subclass IndexSearcher like you said. You could patch
IndexSearcher by adding this method. You could also change the search
method to do this check automatically. You could also add a
setAutoRefresh(boolean) method to IndexSearcher and then do the
automatic refresh only if this was set to
I suggest you look at the Articles section of Lucene's site, in
particular an article about XML, Lucene, and Digester. Much better
than using IndexHTML demo, I believe.
Otis
--- Thomas_Krämer [EMAIL PROTECTED] wrote:
Hello Lucene Users
i use Lucene 1.3rc3 to index several thousand metadata
This is a question for lucene-user list...redirecting.
Looks okay, except it doesn't look like real code. Also, you are
catching Exception and only logging it. Maybe that exception hides the
source of the problem.
Otis
--- [EMAIL PROTECTED] wrote:
Greetings,
I upgraded from lucene-1.2.jar
--- Scott Smith [EMAIL PROTECTED] wrote:
I have an application that is reading in XML files and indexing them.
Each
XML file is 3K-6K bytes. This application preloads a database that I
will
add to on the fly later. However, all I want it to do initially is
take
some existing files and
I think this is a FAQ.
Keep that single IndexSearcher until you change the index and want that
IS to see those changes.
Otis
--- Karl Koch [EMAIL PROTECTED] wrote:
Hi all,
I have a search method who is used by many programs with different
queries.
I therefore do not want to close the
Look at the IndexWriter Javadocs. One of the fields allows you to set
maximum term length. This may also be a problem with the HTML parser
you are using. You didn't share a lot of details, so I cannot help
more.
Otis
--- Syrén_Per [EMAIL PROTECTED] wrote:
Hi all,
Have a question
Hello Morus,
--- Morus Walter [EMAIL PROTECTED] wrote:
Hi,
I'm currently trying to get rid of query parser problems with
stopwords
(depending on the query, there are ArrayIndexOutOfBoundsExceptions,
e.g. for stop AND nonstop where stop is a stopword and nonstop not).
While this isn't
Karl:
http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgId=114748
Status: several people have mentioned they wanted to work on it, but
nobody has contributed any patches. The code you see at the above URL
is not compatible with Lucene 1.3, but could be brought up to date.
Otis
--- Karl
Redirecting to lucene-user
--- Jim Hargrave [EMAIL PROTECTED] wrote:
Can anyone tell me why these two queries would produce different
results:
+A -B
A -(-B)
A and +A are not the same thing when you have multiple terms in a
query.
Also, we are having a hard time understanding why
Use RAMDirectory and then use IndexWriter's addIndexes(Directory[]) method.
Otis
--- Chong, Herb [EMAIL PROTECTED] wrote:
does anyone have an example of using RAMDirectory during indexing and
then copying the index into a FSDirectory?
Herb...
You can add multiple String values to a single Field.
I don't remember the API to provide an example here, but you should be
able to find this in the Javadoc. Maybe even in FAQ, not sure.
Otis
--- Gabe [EMAIL PROTECTED] wrote:
If I have a group of documents and I want to filter on
a
Hello,
This is not a known problem.
The mention of Cocoon makes me think XML.
What format are your documents in?
If they are in XML, the first place to look for performance-related
problems is the XML parser. It looks like you got a new version of
Cocoon, so maybe this new version includes a
I think that's the only one we've got.
You can browse the Lucene Sandbox contributions directory, it's there.
Otis
--- Weir, Michael [EMAIL PROTECTED] wrote:
Is the CJKAnalyzer the best to use for Japanese? If not, which is?
If so,
from where can I download it?
Thanks.
Michael Weir .
Otis Gospodnetic
--- Boris Goldowsky [EMAIL PROTECTED] wrote:
Strangely, the web site does not seem to list any vendors who provide
incident support for Lucene. That can't be right, can it?
Can anyone point me to organizations that would be willing to provide
support for Lucene issues
Not as far as I know.
Otis
--- Stefan Groschupf [EMAIL PROTECTED] wrote:
Am 30.01.2004 um 22:11 schrieb Stefan Groschupf:
JBoss Group http://jboss.org/
Does jboss really support maven?
Sorry, doing 2 things at the same time is not good.
Should be: Does jboss really
Use JDBC to connect to your DB, issue appropriate SELECTs to select
each of your entity/document units, then use the returned data to create
instances of Lucene documents, add those to the index via IndexWriter,
and you got yourself a Lucene index that represents data you have
stored in DB.
If
The best way to submit contributions is via Bugzilla.
For instance, here is the current queue of contributed code, patches,
etc.:
There is a score.
Look at the Similarity class.
Otis
--- [EMAIL PROTECTED] wrote:
Hi!
Is there a hit quality rating in Lucene or are there only hits and
non-hits?
Timo
I believe it would be something like Message-ID or
--- Caroline Jen [EMAIL PROTECTED] wrote:
I am trying to build message inboxes for all
registered members of a web site. Therefore, each
thread (i.e. under a certain discussion topic) can
have several postings. And each registered member's
I believe it would be the value of a 'Message-ID' or 'Reference' or
'Reference-ID' message header.
However, I remember reading that mail readers are not very good at
sticking to a standard (some RFC, I guess), so they don't always
provide the correct ID, or they store it under non-standard names,
Good news, I was looking forward to the Perl port.
I added it to the list of Lucene ports on Lucene site.
Otis
--- Tony Bowden [EMAIL PROTECTED] wrote:
Plucene 1.0 has just been released to CPAN, and is available at
http://search.cpan.org/dist/Plucene/
This is a port of Lucene to Perl,
A problem with Gump, I believe.
Otis
--- Eric Jain [EMAIL PROTECTED] wrote:
How come there is no nightly snapshot newer than 2003-09-09 at
http://cvs.apache.org/builds/jakarta-lucene/nightly/?
I would first look at the exact command line that is used to start the
app server. Could it be that it includes something like
-Djava.io.tmpdir=some-directory-here ?
Lucene uses the java.io.tmpdir System property to determine the
location/directory to use for lock files. Maybe this app server uses
some
You should probably always try to use Directory, and not String or
FSDirectory.
Directory is the most abstract 'index type and location entity', and
using it smartly allows you to change your index type and location more
easily, should you ever choose to do that.
Otis
--- Scott Smith [EMAIL
Without seeing more information/code, I can't tell which part of your
system slows down with time, but I can tell you that Lucene's 'add'
does not slow over time (i.e. as the index gets larger). Therefore, I
would look elsewhere for causes of the slowdown.
The easiest thing to do is add logging
There were some recent contributions that should make this possible and
simple to do.
The code should be added to Lucene CVS repository in the next week or
so.
Otis
--- Gabe [EMAIL PROTECTED] wrote:
Hi,
I was wondering whether it was possible to sort search
results by the order of the
--- Leo Galambos [EMAIL PROTECTED] wrote:
Otis Gospodnetic napsal(a):
Without seeing more information/code, I can't tell which part of
your
system slows down with time, but I can tell you that Lucene's 'add'
does not slow over time (i.e. as the index gets larger). Therefore,
I
would
--- Leo Galambos [EMAIL PROTECTED] wrote:
Otis Gospodnetic napsal(a):
Thus I do not know how it could be O(1).
~ O(1) is what I have observed through experiments with indexing of
several million documents.
What exactly did you measure? Just the time of the insert
will the names of the relevant files be and will
I be able to use 1.3 final still (simply integrating
the contributions into my own code) or would I have to
go with the latest code from CVS?
Thanks again,
Gabe
--- Otis Gospodnetic [EMAIL PROTECTED]
wrote:
There were some recent contributions
Update in Lucene means: delete the document and then re-add it.
This may be a FAQ.
Otis
--- Markus Brosch [EMAIL PROTECTED] wrote:
However, I have problems with reindexing.
First, I index all my object contents. Then some of these objects
can
change
and need to be re-indexed.
I
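The delete-then-re-add update described above can be sketched over a toy in-memory "index" of (id, content) pairs. This is an illustration of the pattern only, not the Lucene API; the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "update = delete by term, then re-add": there is no
// in-place update call, so every document matching the id term is
// removed and a fresh document is added.
public class UpdateSketch {
    private final List<String[]> docs = new ArrayList<>(); // {id, content} pairs

    public void add(String id, String content) {
        docs.add(new String[] { id, content });
    }

    // "Update" = remove every document matching the id term, then re-add.
    public void update(String id, String newContent) {
        docs.removeIf(d -> d[0].equals(id));
        add(id, newContent);
    }

    public String get(String id) {
        for (String[] d : docs) if (d[0].equals(id)) return d[1];
        return null;
    }

    public int size() { return docs.size(); }
}
```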
If there are commit.lock files being left over, you should really
investigate why that is happening. Something is probably dying, and you
are not catching it and cleaning up by closing things like IndexReader
or IndexWriter.
If you want to forcefully unlock the index, use isLocked and unlock
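Forceful unlocking in the spirit of isLocked/unlock can be sketched as follows. This is an illustration, not a drop-in tool: the lock-file name mirrors the leftover "commit.lock" mentioned above, and only do this when you are sure nothing else is using the index.

```java
import java.io.File;

// Sketch of checking for and clearing a leftover lock file after a
// crash, before reopening the index.
public class UnlockSketch {
    public static boolean isLocked(File indexDir) {
        return new File(indexDir, "commit.lock").exists();
    }

    public static void unlock(File indexDir) {
        File lock = new File(indexDir, "commit.lock");
        if (lock.exists() && !lock.delete()) {
            throw new RuntimeException("could not delete " + lock);
        }
    }
}
```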
, this will be the most comprehensive and up to date
documentation about Lucene.
Otis Gospodnetic
--- Nicolas Maisonneuve [EMAIL PROTECTED] wrote:
hi,
it would be great if a page with all features of lucene would be
created in the apache lucene site !
in the sourceforge website
(http
Lots of params in that mlt method, but it seems flexible.
I'll try it.
Small optimization suggestion: use int[] with a single element for that
words Map, instead of creating lots of Integer()s. Actually, maybe
JVMs are smart and don't allocate new objects for the same int wrapped
in Integer
Searches ARE case sensitive, it is just that some Analyzers lowercase
all tokens. If you are using WhitespaceAnalyzer, then tokens will not
be lowercased, so a search for albert and Albert may yield different
results.
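The point above can be sketched with two stand-in "analyzers". These are not real Lucene analyzers; they just show that matching is exact on tokens, and only looks case-insensitive when both the indexed text and the query went through lowercasing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Stand-ins for a whitespace-only analyzer vs. one that also
// lowercases: the second makes "albert" match "Albert" because both
// sides were normalized, not because search ignores case.
public class CaseSketch {
    public static List<String> whitespaceAnalyzer(String text) {
        return Arrays.asList(text.split("\\s+"));
    }

    public static List<String> lowercasingAnalyzer(String text) {
        return whitespaceAnalyzer(text).stream()
                .map(String::toLowerCase)
                .collect(Collectors.toList());
    }

    // Token match is always exact.
    public static boolean matches(List<String> indexedTokens, String queryToken) {
        return indexedTokens.contains(queryToken);
    }
}
```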
Otis
--- [EMAIL PROTECTED] wrote:
On Monday 16 February 2004 19:20, [EMAIL
Custom? :)
Otis
--- [EMAIL PROTECTED] wrote:
On Monday 16 February 2004 19:57, Otis Gospodnetic wrote:
Searches ARE case sensitive, it is just that some Analyzers
lowercase
all tokens. If you are using WhitespaceAnalyzer, then tokens will
not
GermanAnalyzer apparently is one of them
Timo, by the nature of your questions it seems like you didn't see the
Articles section of Lucene's site. There are links to several articles
there. A few of them explain indexing (intro + more advanced), at
least one explains QueryParser and maybe Analyzer, and a few explain
vanilla searching.
I've just got a couple of questions which I can't quite work
out... wondered if
someone could help me with them:
1. What happens if i make a backup (copy) of an index while documents
are
being added? Can it cause problems, and if so is there a way to
safely do
this?
You should be okay.