You can use Lucene to index whatever you want. You have to write your own 'crawler' (I think some have been contributed), which decides what gets fed to the index. It is very flexible in that your indexed data does NOT have to be web pages (but can be).
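To make the "write your own crawler and feed the index" idea concrete, here is a rough sketch in plain Java. It is NOT Lucene's API: the class `SimpleSiteIndex` and its methods are made-up names for illustration, and the index is a toy in-memory map. It only shows the shape of the thing: your code decides which paths and file types to skip, hands the remaining text to an indexer, and queries it later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch, not Lucene: a tiny in-memory inverted index that is
 * fed by your own "crawler" code and can exclude paths and file types.
 */
class SimpleSiteIndex {
    private final List<String> excludedPrefixes = new ArrayList<>();
    private final List<String> excludedExtensions = new ArrayList<>();
    // term -> sorted set of document paths that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Skip every document whose path starts with this prefix. */
    void excludePath(String prefix) {
        excludedPrefixes.add(prefix);
    }

    /** Skip every document whose path ends with this extension (case-insensitive). */
    void excludeExtension(String extension) {
        excludedExtensions.add(extension.toLowerCase(Locale.ROOT));
    }

    boolean isExcluded(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        for (String prefix : excludedPrefixes) {
            if (path.startsWith(prefix)) return true;
        }
        for (String extension : excludedExtensions) {
            if (lower.endsWith(extension)) return true;
        }
        return false;
    }

    /** Indexes the text under the given path; returns false if the path is excluded. */
    boolean add(String path, String text) {
        if (isExcluded(path)) return false;
        // Split on anything that is not a letter or digit; lower-case the terms.
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(path);
        }
        return true;
    }

    /** Returns the paths of all indexed documents containing the single term. */
    Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    public static void main(String[] args) {
        SimpleSiteIndex index = new SimpleSiteIndex();
        index.excludePath("/admin/");      // example exclusion, pick your own
        index.excludeExtension(".gif");    // example exclusion, pick your own
        index.add("/index.html", "Welcome to the Tomcat users site");
        index.add("/admin/users.html", "Tomcat admin console"); // skipped
        System.out.println(index.search("tomcat"));
    }
}
```

With the real library you would replace the map with a Lucene index and keep the exclusion checks in your crawler; the filtering is your code's job either way, so it is independent of how the site is built.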
http://jakarta.apache.org/lucene

Charlie

> -----Original Message-----
> From: Felipe Schnack [mailto:felipes@;ritterdosreis.br]
> Sent: Thursday, October 31, 2002 5:30 AM
> To: Tomcat Users List
> Subject: RE: Search engines and MVC--to clarify
>
>
> I will need to do page search in a website I'm developing right now.
> Can I use htdig for this?
> I basically need a simple search engine, not many bells and whistles,
> but there must be an option to exclude some paths/file types from the
> indexing...
>
> On Thu, 2002-10-31 at 06:25, Ralph Einfeldt wrote:
> > It depends on the search engine that you use.
> >
> > The search engine has two parts:
> > - indexer
> >   There are two general solutions for the indexer
> >   - web crawler
> >     This kind of indexer is completely independent
> >     of the internal architecture. It's just following
> >     the links in the pages. If it's a good indexer
> >     like htdig, it is not ignoring the query string.
> >     Otherwise you have to build your site in a way
> >     that it doesn't use query strings. (But that
> >     has nothing to do with MVC, that's true
> >     for any dynamic site)
> >   - internal
> >     This kind of indexer is typically built in your
> >     own code and is used primarily to build searches
> >     for specific contents in the site. (E.g.: Product search)
> >     This kind of indexing has to fit your architecture.
> > - query engine
> >   There are four typical solutions for this
> >   - standalone
> >     In this case the search engine contains a component
> >     that presents the result to the user.
> >   - integrated
> >     In this case you get an API that you can use in your
> >     own code to present the result to the user.
> >   - integrated+standalone
> >     Here you use the standalone solution to get the
> >     results internally and use your own code to present
> >     the result to the user.
> >   - internal
> >     This is the counterpart for the internal indexer.
> >
> > Ralph Einfeldt
> > Uptime Internet Solution Center GmbH
> > Hamburg, Germany
> > Hosting, Content Management, Java Consulting
> > http://www.uptime-isc.de
> >
> > > -----Original Message-----
> > > From: Michele Emmi [mailto:micheleemmi@;hotmail.com]
> > > Sent: Thursday, October 31, 2002 12:13 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Search engines and MVC--to clarify
> > >
> > > To clarify...I have 2 websites built on the MVC architecture,
> > > I would like to have them indexed...does anyone have any
> > > experience in this...
> >
> > --
> > To unsubscribe, e-mail: <mailto:tomcat-user-unsubscribe@;jakarta.apache.org>
> > For additional commands, e-mail: <mailto:tomcat-user-help@;jakarta.apache.org>
> >
> --
>
> Felipe Schnack
> Systems Analyst
> [EMAIL PROTECTED]
> Cell: (51)91287530
> Linux Counter #281893
>
> Faculdade Ritter dos Reis
> www.ritterdosreis.br
> [EMAIL PROTECTED]
> Phone/Fax: (51)32303328
