Hi, this is a great tool for retrieving and scraping HTML pages (rendered or not):
http://www.research.compaq.com/SRC/WebL/

:-)

Chris Opler

w i l l i a m__b o y d wrote:

> > If they're mostly static, why not just code a little crawler to
> > request the pages via the web-server and parse the rendered HTML?
> >
>
> right then. i've added that onto my list of things to do. immediately after
> "meet project deadline" and "...learning javacc and lucene inside and
> out..." ;) if anyone has such code they're willing to contribute i would
> put it to good use.
>
> ----- Original Message -----
> From: Steven J. Owens <[EMAIL PROTECTED]>
> To: Lucene Users List <[EMAIL PROTECTED]>; w i l l i a m__b o y d <[EMAIL PROTECTED]>
> Sent: Sunday, February 24, 2002 1:25 AM
> Subject: Re: JSP Parser class wanted
>
> > w i l l i a m__b o y d <[EMAIL PROTECTED]> writes:
> >
> > > i have had some success in solving my problem. mind you, it is a
> > > hack; a quick fix. it may or may not work for everyone. also the jsp
> > > pages i am indexing/searching have very little dynamically generated
> > > content. they are mostly static.
> >
> > If they're mostly static, why not just code a little crawler to
> > request the pages via the web-server and parse the rendered HTML?
> >
> > Steven J. Owens
> > [EMAIL PROTECTED]
>
> --
> To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

--
=======================
http://www.openwine.org
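
For anyone who wants to try the "little crawler" idea from the quoted thread in plain Java rather than WebL, here is a rough sketch of the two steps: request the rendered page from the web server, then reduce the HTML to text and links. The class name RenderedPageFetcher and its methods are made up for illustration and are not part of Lucene or WebL. The regex-based tag stripping is an assumption chosen to keep the example short; it does not decode entities and will trip over malformed markup, so a real HTML parser is the safer choice for anything beyond mostly static pages.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Lucene or WebL class.
class RenderedPageFetcher {

    // Whole <script>...</script> and <style>...</style> blocks (case-insensitive, across lines).
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)[^>]*>.*?</\\1\\s*>");
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");
    // Quoted href values of <a> tags.
    private static final Pattern HREF =
            Pattern.compile("(?i)<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']");

    // Requests the page through the web server, so JSPs arrive already rendered.
    // Assumes the response is UTF-8; adjust the charset for your site.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    // Crude text extraction: drop script/style blocks, drop tags, collapse whitespace.
    static String stripTags(String html) {
        String withoutScripts = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        String withoutTags = TAG.matcher(withoutScripts).replaceAll(" ");
        return withoutTags.replaceAll("\\s+", " ").trim();
    }

    // Collects href values so the crawler knows which pages to request next.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }
}
```

The string returned by stripTags(fetch(url)) is what you would hand to your indexing code, and extractLinks gives the crawler its next URLs. Relative links would still need resolving against the page URL, and a real crawler should remember which pages it has already visited.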
