Since you want the rendered JSP pages, I would recommend using a crawler to get the files and then index the resulting file with Lucene. The jsp vs. html file extension doesn't matter. You can index database content, xml files, html files, or any text data you want. However, you have to use or create a way to get that text data into a Lucene Document.
An integrated Lucene Crawler is LARM in the Lucene Sandbox area. I haven't used it yet, but other have and it sounds great. I hope this helps --Peter On Tuesday, October 15, 2002, at 11:37 PM, Richard Doorduin wrote: > > Thanks Peter, > > I know that jsp is a dynamic page, that is the problem. > but a jsp also contains static information, usualy this is html. > If index html files, it also creates some keywords to the link. > is that possible for jsp files to? > > and should i write my own java application to index jsp, > or is it also possible to use the current application? > > thanks for helping, > > Richard > > > -----Oorspronkelijk bericht----- > Van: Peter Carlson [mailto:[EMAIL PROTECTED]] > Verzonden: woensdag 16 oktober 2002 8:06 > Aan: Lucene Users List > Onderwerp: Re: Indexing options > > > Lucene will index anything you ask it to index. > > In order to index content, you have to create a Lucene Document object. > Lucene Document object is made up of Field's. Each field contains data. > > The html indexing example parses the html text and create appropriate > Lucene Documents. If you have other content you want to index, just > follow the same example. > > You question is a little strange since you want Lucene to index a > dynamic page or are you asking to index the raw jsp file? > > I hope this helps a little. > > --Peter > On Monday, October 14, 2002, at 11:53 PM, Richard Doorduin wrote: > >>> Dear [EMAIL PROTECTED], >>> >>> i'm looking forward to see come out the new version of lucene. >>> it really is a handy tool. >>> >>> But the reason i wrote this e-mail is because i couldnt figure >>> something >>> out. >>> at the moment lucene is only indexating .html sites. >>> But wat "we" tomcat users also like to index are *.jsp files. >>> >>> i couldnt figure out where to define these extensions. >>> or is it not yet possible to do that? >>> >>> thanks for listening! >>> >>> Greetings, >>> >>> >>> Richard Doorduin. >> >> -- >> To unsubscribe, e-mail: >> <mailto:[EMAIL PROTECTED]> >> For additional commands, e-mail: >> <mailto:[EMAIL PROTECTED]> >> >> > > > -- > To unsubscribe, e-mail: > <mailto:[EMAIL PROTECTED]> > For additional commands, e-mail: > <mailto:[EMAIL PROTECTED]> > > -- > To unsubscribe, e-mail: > <mailto:[EMAIL PROTECTED]> > For additional commands, e-mail: > <mailto:[EMAIL PROTECTED]> > > -- To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
