Re: Does lucene support distributed indexing?

2008-04-27 Thread Samuel Guo
Thanks a lot :)
2008/4/26 Grant Ingersoll <[EMAIL PROTECTED]>:
>
> On Apr 26, 2008, at 2:33 AM, Samuel Guo wrote:
> > Hi all,
> >
> > I am a lucene newbie:)
> >
> > It seems that lucene doesn't support distributed indexing:(
> > As some IR research papers mentioned, when the documents collection

Re: Does lucene support distributed indexing?

2008-04-27 Thread Otis Gospodnetic
There are actually several distributed indexing or searching projects in Lucene (the top-level ASF Lucene project, not Lucene Java), and it's time to start thinking about the possibility of bringing them together, finding commonalities, etc. Here is the summary: - Lucene - distributed search vi

Does Lucene save an offline version of web pages?

2008-04-27 Thread Legolas wood
Hi, thank you for reading my post. I have to design a system with the following requirements, and I think Lucene or one of the projects based on Lucene can serve as a base to build on. Here are the requirements: - Fetch and index some pages (containing Word and PDF documents) on a daily bas

TrecDocMaker

2008-04-27 Thread DanaWhite
Greetings, I am trying to use TrecDocMaker so I can index and evaluate Lucene on a TREC collection. It seems like I would just repeatedly call makeDocument() until all the Documents have been created, but makeDocument() appears to just read forever. In general TrecDocMaker seems like

Re: Does Lucene save an offline version of web pages?

2008-04-27 Thread Lukas Vlcek
Hi, this sounds like a job for Nutch (one of the Lucene family projects).
On Sun, Apr 27, 2008 at 8:26 PM, Legolas wood <[EMAIL PROTECTED]> wrote:
> Hi
> Thank you for reading my post.
> I have to design a system with the following requirements, I think
> Lucene or one of the projects which are based

Re: Does Lucene save an offline version of web pages?

2008-04-27 Thread Bill Janssen
> - Fetch and index some pages (containing word and pdf documents) on
> daily basis.
> - Extract all pages that contain some provided keywords after fetching
> the pages.
> - Create some bulletin from fetched pages, bulletin will be in pdf
> format and are categorized based on keywords.
> - provide

How to Uniquely Identify Documents in a Lucene Index

2008-04-27 Thread Hasan Diwan
I'm working on a JSP-based, free-form text storage & retrieval system built on Lucene. Part of my desired feature set includes the ability to retrieve, edit, and update the text comprising a document. The user flow involves: a search for a document, whose "all" field is then retrieved, then it can be