I am interested in pursuing experienced people's understanding, as I have half
the queue approach developed already.
I am not following why you don't like the queue approach, Sergiu. From what I
gathered from this board, if you do lots of updates, the opening of the
IndexWriter is very
[EMAIL PROTECTED] wrote:
I am interested in pursuing experienced people's understanding, as I have half the queue approach developed already.
Well, I think that experienced people developed Lucene :) They offered us
the possibility to use multithreading and concurrent searching.
Of course ..
Hello All,
Is it possible to index the documents in an incremental fashion? What I
mean by this is: update a document in the index only if it has changed
since the last time it was indexed. This can save a considerable amount of
time while indexing.
Any pointers are appreciated.
-H
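
One way to picture "update only if it has changed" is to remember each document's last-modified time at indexing and compare it on the next run. The sketch below is plain Java, not Lucene code; ChangeDetector and selectStale are hypothetical names, and the modification times are assumed to come from something like File.lastModified().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: picks the documents that need (re)indexing.
class ChangeDetector {
    // path -> last-modified time recorded when the document was last indexed
    private final Map<String, Long> indexedAt = new HashMap<>();

    /** Returns paths that are new or modified since they were last indexed, and records them. */
    List<String> selectStale(Map<String, Long> currentModTimes) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> e : currentModTimes.entrySet()) {
            Long seen = indexedAt.get(e.getKey());
            // Never indexed, or modified after the recorded time: needs indexing.
            if (seen == null || seen < e.getValue()) {
                stale.add(e.getKey());
                indexedAt.put(e.getKey(), e.getValue());
            }
        }
        return stale;
    }
}
```

Only the paths returned by selectStale would then be passed to the indexer. Documents that have been deleted from disk are not handled in this sketch.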
Take a peek at IndexHTML.java in the demo that ships with Lucene. It
performs an incremental update as you have described.
- Original Message -
From: Hetan Shah [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, November 15, 2004 4:45 PM
Subject: Incremental Indexing
This is how I implemented incremental indexing. If anyone sees anything
wrong, please let me know.
Our motivation is similar to John Eichel's. We have a digital asset
management system and when users update, delete or create a new asset,
they need to see their results immediately.
The most
then the whole
database. I think I need to look in the developer archives.
JohnE
- Original Message -
From: Luke Shannon [EMAIL PROTECTED]
Date: Monday, November 15, 2004 5:14 pm
Subject: Re: Lucene : avoiding locking (incremental indexing)
Hi Luke;
I have a similar system (except
[EMAIL PROTECTED]
Sent: Monday, November 15, 2004 5:50 PM
Subject: Re: Lucene : avoiding locking (incremental indexing)
It really seems like I am not the only person having this issue.
So far I am seeing 2 solutions and honestly I don't love either totally.
I am thinking that without changes
, November 24, 2003 10:07 PM
To: 'Lucene Users List'
Subject: RE: Lucene refresh index function (incremental indexing).
Does it support indexing the contents of pdf files? I have found one project
called PDFBox that can be integrated with Lucene to search inside of the pdf
files. Currently, Lucene can
]
Sent: Monday, November 24, 2003 11:07 PM
To: 'Lucene Users List'
Subject: RE: Lucene refresh index function (incremental indexing).
Does it support indexing the contents of pdf files? I have found one project
called PDFBox that can be integrated with Lucene to search inside of the pdf
files
Message-
From: Tun Lin [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 10:07 PM
To: 'Lucene Users List'
Subject: RE: Lucene refresh index function (incremental indexing).
Does it support indexing the contents of pdf files? I have found one project
called PDFBox that can
,
Oliver
-Original Message-
From: Ben Litchfield [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 25, 2003 9:45 AM
To: Lucene Users List
Subject: RE: Lucene refresh index function (incremental indexing).
Yes, just add the log4j configuration. The easiest way to do
index function (incremental indexing).
I do have other problems with PDFBox-0.6.4. For one, it prints annoying debug
information from its very low-level parsing process. For another, I got an
infinite loop while indexing pdf files, although they say the infinite loop bug
has been fixed in their release notes
Message-
From: Ben Litchfield [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 25, 2003 9:45 AM
To: Lucene Users List
Subject: RE: Lucene refresh index function (incremental indexing).
Yes, just add the log4j configuration. The easiest way to do that is as a
system parameter like
To: Lucene Users List; [EMAIL PROTECTED]
Subject: RE: Lucene refresh index function (incremental indexing).
I was able to get PDFBox to work with my JSP webpages.
I think you will have to write your own code, in a way, to handle the PDF files
(while still calling the Lucene functions)
doc
the
indexes on its own?
-Original Message-
From: Victor Hadianto [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 1:07 PM
To: Lucene Users List
Subject: Re: Lucene refresh index function (incremental indexing).
Ah .. ic,
But you don't need to do that even if you can do it. Lucene
that monitors the directory
for newly added documents. This application will then just index that new
document without reindexing the entire document set.
If you do incremental indexing, the indexing does take longer as the
document base grows, but you shouldn't really have this problem until your
Tun Lin wrote:
These are the steps I took:
1) I compile all the files in a particular directory using the command:
java org.apache.lucene.demo.IndexHTML -create -index c:\\index ..
, putting all the indexed files in c:\\index.
2) Every time I add an additional file to that directory, I need
Will the final version 1.3 include an application that does the incremental
updates automatically?
-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 25, 2003 5:01 AM
To: Lucene Users List
Subject: Re: Lucene refresh index function (incremental
]
Sent: Tuesday, November 25, 2003 5:01 AM
To: Lucene Users List
Subject: Re: Lucene refresh index function (incremental indexing).
Tun Lin wrote:
These are the steps I took:
1) I compile all the files in a particular directory using the command:
java org.apache.lucene.demo.IndexHTML -create
I delete the old ones and add them again manually. But how do I reindex the
documents automatically without doing it manually?
You don't need to reindex the documents again. Lucene does incremental
indexing. Just add your document to the index and that's it. You need to
create a new
in that particular directory?
Please advise.
-Original Message-
From: Victor Hadianto [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 12:36 PM
To: Lucene Users List
Subject: Re: Lucene refresh index function (incremental indexing).
I delete the old ones and add them again manually
Ah .. ic,
But you don't need to do that even if you can do it. Lucene does incremental
indexing. So you would create a new program to add your document manually
using IndexWriter, not blatting the index and doing it again.
Seems like you are just trying out Lucene; I suggest having a look
Hi,
It's not clear what you mean when you say refresh indexes or
re-compiling. If you're adding new documents just use the add()
method. If you are replacing documents, you need to first delete the
old ones and then add them again. Look at the mailing list archive for
this, since it's been
Moved to lucene-user (what's going on today?).
Business.com...what was the amount again? 7 million USD?
You can keep IndexSearcher open while adding/reoptimizing, but if you
want to see the changes you need to close/discard the IndexSearcher
instance and get a new one.
Otis
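
The point about needing a fresh IndexSearcher can be pictured with a point-in-time snapshot. This is a toy model in plain Java, not Lucene's implementation; SnapshotIndex and its nested Searcher are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model, not Lucene: a searcher sees the index as it was when the searcher was opened.
class SnapshotIndex {
    private final List<String> docs = new ArrayList<>();

    void add(String doc) { docs.add(doc); }

    /** Opens a searcher over a copy of the documents present right now. */
    Searcher openSearcher() { return new Searcher(new ArrayList<>(docs)); }

    static final class Searcher {
        private final List<String> view;

        Searcher(List<String> view) { this.view = Collections.unmodifiableList(view); }

        /** Number of documents visible to this searcher. */
        int count() { return view.size(); }
    }
}
```

In this model a searcher opened before an add keeps answering from its old view, and a searcher opened afterwards sees the new document, which mirrors the close-and-reopen advice above.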
--- Oshima, Scott
1. Open reader;
2. Delete all old documents;
3. Close reader;
4. Open writer;
5. Add all new documents;
6. Close writer.
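
The six steps above amount to delete-then-add. A minimal sketch of that pattern in plain Java follows; ToyIndex is a hypothetical in-memory stand-in for the reader and writer, and none of its methods are Lucene API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy stand-in for an index, not Lucene: each entry is {key, body}.
class ToyIndex {
    private final List<String[]> docs = new ArrayList<>();

    /** Steps 1-3: remove every document with this key; returns how many were removed. */
    int deleteByKey(String key) {
        int removed = 0;
        for (Iterator<String[]> it = docs.iterator(); it.hasNext();) {
            if (it.next()[0].equals(key)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    /** Steps 4-6: append a document; adding alone never replaces an older version. */
    void add(String key, String body) { docs.add(new String[] {key, body}); }

    /** An update is a delete of the old version followed by an add of the new one. */
    void update(String key, String body) {
        deleteByKey(key);
        add(key, body);
    }

    int size() { return docs.size(); }
}
```

Calling add twice with the same key leaves two copies in the toy index, which is why the delete has to come first.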
If, before step one, you open another IndexReader, then you can continue
to use it for searches while the update is in progress. If you then,
Hello,
Currently, I use the following procedure to update an index
incrementally:
1. Build document
2. Open index reader
3. Delete any previous version of the document using a key field
4. Close index reader
5. Open index writer
6. Add document to index
7.
If you want to update a set of documents, you can remove their previous
versions first and then add them after that. In the meantime, documents
of this set are temporarily unavailable. If you have to update a single
document and make the changes immediately public, I don't know a better
+1. Support for transactions in Lucene is high on my list of desirable
features as well. I would love to have time to look into adding this,
but lately... well, you know how that goes.
Scott
Eric Jain wrote:
If you want to update a set of documents, you can remove their previous
version
Hi,
Is it possible to do incremental indexing with IndexHTML? If so, how do you
do it?
Thx..