Hi Paul,

Thanks. So the plan is to use Nutch for crawling and integrate Lucene into the web application, so that it can serve searches online.

BTW, Nutch seems to have only a Linux version, but my development is on Windows. Am I right?

Zhou

--- On Fri, 8/1/10, Paul Libbrecht <[email protected]> wrote:

From: Paul Libbrecht <[email protected]>
Subject: Re: a complete solution for building a website search with lucene
To: [email protected]
Date: Friday, 8 January, 2010, 4:27 PM

Zhou,

Lucene is a back-end library. It is very useful for developers, but it is not a complete site search engine. Nutch is a Lucene-based site search engine, and it does crawl. Solr also provides functions close to these, with a lot of thought given to flexible integration; its crawling methods are instead based on feeds or other acquisition methods (see DIH, for example).

paul

On 8 Jan 2010, at 08:08, <[email protected]> wrote:

> Hi,
>
> I am new to Lucene.
>
> To build a web search function, I need a back-end indexing function.
> But before that, should I run a crawler? Lucene builds its index from HTML
> documents, while a crawler can turn the website's pages into HTML documents.
> Am I right?
>
> If so, could anyone suggest a crawler, such as Nutch?
> Thanks
> Zhou
