Re: document numbers

2005-01-31 Thread Morus Walter
Hi Jonathan, Yet another burning question :-). Can someone explain how the document numbers in Lucene documents work? For example, the TermDocs.doc() method returns the current doc number. How can I get this doc number if I just have a Document? I don't think you can. A document does

Re: Re-Indexing a moving target???

2005-01-31 Thread Yousef Ourabi
Saad, Here is what I got. I will post again, and be more specific. -Y --- Nader Henein [EMAIL PROTECTED] wrote: We'll need a little more detail to help you: what are the sizes of your updates, and how often are they updated? 1) No, just re-open the index writer every time to re-index,

Re: carrot2 question too - Re: Fun with the Wikipedia

2005-01-31 Thread Dawid Weiss
Hi. Coming up with answers... a little belated, but hope you're still on: we have been experimenting with carrot2 and are very pleased so far, only one issue: there is no release, not even an alpha one, and the dependencies seemed to be patched (jama) Yes, there is no official release. We just

ANNOUNCE: MindRetrieve 0.4 - Search the web you have seen

2005-01-31 Thread aurora
I am pleased to announce that MindRetrieve 0.4.0 has been released. MindRetrieve is a desktop search tool that helps users search and organize the web they have seen. Download it from http://mindretrieve.berlios.de/. Every day we read a large amount of information from the world wide web. The

RE: carrot2 question too - Re: Fun with the Wikipedia

2005-01-31 Thread Adam Saltiel
David, Hi, Would you be able to comment on the coincidentally recent thread RE: - Grouping Search Results by Clustering Snippets:? Also, when I looked at Carrot2, the pipeline is implemented over HTTP. I wonder how efficient that is, or can it be changed, for instance for an all local

Re: Disk space used by optimize

2005-01-31 Thread Doug Cutting
Yura Smolsky wrote: There is a big difference when you use compound index format or multiple files. I have tested it on the big index (45 GB). When I used the compound file format, optimize took 3 times more space, b/c *.cfs needs to be unpacked. Now I do use non compound file format. It needs like

RE: carrot2 question too - Re: Fun with the Wikipedia

2005-01-31 Thread Otis Gospodnetic
Adam, Dawid posted some code that lets you use Carrot2 locally with Lucene, without the componentized pipe line system described on Carrot2 site. Otis --- Adam Saltiel [EMAIL PROTECTED] wrote: David, Hi, Would you be able to comment on coincidentally recent thread RE: - Grouping Search

Re: carrot2 question too - Re: Fun with the Wikipedia

2005-01-31 Thread David Spencer
Otis Gospodnetic wrote: Adam, Dawid posted some code that lets you use Carrot2 locally with Lucene, see embedded zip url here for carrot2/lucene code - it may also be in the carrot2 cvs tree too - this is what I used in the wikipedia/cluster stuff as the basis

Use an executable from java ...

2005-01-31 Thread Bertrand VENZAL
Hi all, I have a problem running a conversion tool that turns a PDF into HTML under Linux. I have an executable, pdftohtml, which works correctly in batch mode, and it also works when I run it through Java under Windows 2000, BUT it does not work at all on the server under

Re: Use an executable from java ...

2005-01-31 Thread Ben Litchfield
I will assume you are asking this question on the lucene mailing list because you now want to index that PDF document. Have you tried PDFBox? It can't create an HTML file for you, but it can extract text. Ben http://www.pdfbox.org On Mon, 31 Jan 2005, Bertrand VENZAL wrote: Hi all, I've

Re: Use an executable from java ...

2005-01-31 Thread Kelvin Tan
Check out http://www.javaworld.com/javaworld/jw-12-2000/jw-1229-traps.html which provides some pointers and code which should be helpful. Cheers, Kelvin http://www.supermind.org On Mon, 31 Jan 2005 19:01:11 +0100, Bertrand VENZAL wrote: Hi all, I've a kind of problem to execute a converting

RE: carrot2 question too - Re: Fun with the Wikipedia

2005-01-31 Thread Adam Saltiel
OK, thanks. Adam -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Monday, January 31, 2005 5:51 PM To: Lucene Users List; [EMAIL PROTECTED] Subject: RE: carrot2 question too - Re: Fun with the Wikipedia Adam, Dawid posted some code that lets you use