Hi Jonathan,
Yet another burning question :-). Can someone explain how the document
numbers in Lucene documents work? For example, the TermDocs.doc()
method returns the current doc number. How can I get this doc number
if I just have a Document?
I don't think you can.
A document does
Saad,
Here is what I got. I will post again, and be more
specific.
-Y
--- Nader Henein [EMAIL PROTECTED] wrote:
We'll need a little more detail to help you: what
are the sizes of your
updates, and how often are they updated?
1) No, just re-open the index writer every time to
re-index,
Hi.
Coming up with answers... a little belated, but hope you're still on:
we have been experimenting with carrot2 and are very pleased so far,
only one issue: there is no release, not even an alpha one, and the
dependencies seem to be patched (jama)
Yes, there is no official release. We just
I am pleased to announce that MindRetrieve 0.4.0 has been released.
MindRetrieve is a desktop search tool to help users to search and organize
the web they have seen. Download it from http://mindretrieve.berlios.de/.
Every day we read a large amount of information from the world wide web.
The
David, Hi,
Would you be able to comment on coincidentally recent thread RE: -
Grouping Search Results by Clustering Snippets:?
Also, when I looked at Carrot2, the pipeline is implemented over HTTP. I
wonder how efficient that is, or can it be changed, for instance for an all
local
Yura Smolsky wrote:
There is a big difference between using the compound index format and
multiple files. I have tested it on a big index (45 GB). When I used
the compound file format, optimize took 3 times more space, because the
*.cfs file needs to be unpacked.
Now I use the non-compound file format. It needs like
Adam,
Dawid posted some code that lets you use Carrot2 locally with Lucene,
without the componentized pipeline system described on the Carrot2 site.
Otis
--- Adam Saltiel [EMAIL PROTECTED] wrote:
David, Hi,
Would you be able to comment on coincidentally recent thread RE: -
Grouping Search
Otis Gospodnetic wrote:
Adam,
Dawid posted some code that lets you use Carrot2 locally with Lucene,
see embedded zip url here for carrot2/lucene code - it may also be in
the carrot2 cvs tree too - this is what I used in the wikipedia/cluster
stuff as the basis
Hi all,
I have a problem running a conversion tool that turns a PDF into
HTML under Linux. I have an executable, pdftohtml, which works
correctly in batch mode, and when I use it through Java under
Windows 2000 it also works, BUT it does not work at all on the server under
I will assume you are asking this question on the lucene mailing list
because you now want to index that PDF document.
Have you tried PDFBox? It can't create an HTML file for you but it can
extract text.
Ben
http://www.pdfbox.org
On Mon, 31 Jan 2005, Bertrand VENZAL wrote:
Hi all,
I've
Check out http://www.javaworld.com/javaworld/jw-12-2000/jw-1229-traps.html
which provides some pointers and code which should be helpful.
Cheers,
Kelvin
http://www.supermind.org
On Mon, 31 Jan 2005 19:01:11 +0100, Bertrand VENZAL wrote:
Hi all,
I've a kind of problem to execute a converting
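
A note on the pdftohtml thread above: one frequent reason an external tool
runs fine from a shell but hangs or fails when started from Java is that
nobody reads the child process's output, so its pipe buffer fills up. That
may or may not be the cause in this case; the following is only a minimal
sketch of the usual precaution, written for a current JDK. The class name
ExternalCommand and the file names in the usage note are hypothetical, not
taken from the original messages.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs an external command and captures its output.
public class ExternalCommand {

    /** Exit code plus every line the process wrote to stdout or stderr. */
    public static final class Result {
        public final int exitCode;
        public final List<String> outputLines;

        Result(int exitCode, List<String> outputLines) {
            this.exitCode = exitCode;
            this.outputLines = outputLines;
        }
    }

    public static Result run(String... command)
            throws IOException, InterruptedException {
        // Pass the arguments as a list so no shell parsing is involved.
        ProcessBuilder builder = new ProcessBuilder(Arrays.asList(command));
        // Merge stderr into stdout so a single reader drains both streams.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        List<String> lines = new ArrayList<>();
        // Read until end of stream BEFORE waiting for the exit code;
        // otherwise the child can block on a full output buffer.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int exitCode = process.waitFor();
        return new Result(exitCode, lines);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java ExternalCommand <command> [args...]");
            return;
        }
        Result result = run(args);
        for (String line : result.outputLines) {
            System.out.println(line);
        }
        System.out.println("exit code: " + result.exitCode);
    }
}
```

Assuming pdftohtml is on the PATH of the user the server runs as, a call
could look like `java ExternalCommand pdftohtml input.pdf output.html`.
Printing the captured lines and the exit code usually shows whether the
problem is a missing executable, a permission error, or a wrong path.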
OK, thanks.
Adam
-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Monday, January 31, 2005 5:51 PM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: RE: carrot2 question too - Re: Fun with the Wikipedia
Adam,
Dawid posted some code that lets you use