Hi,

This is a great tool for retrieving and scraping HTML pages (rendered or not):

http://www.research.compaq.com/SRC/WebL/

:-)
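
If you would rather stay in plain Java, the crawler idea quoted below (request the page through the web server, then parse the rendered HTML) might look roughly like this. It is only a sketch: the class and method names are made up for illustration, it assumes the pages are UTF-8, and the regex-based tag stripping is a crude stand-in for a real HTML parser (it does not decode entities, for example).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Lucene or WebL.
class RenderedPageFetcher {

    // Request the page through the web server, so a JSP is rendered
    // to HTML before we ever see it. Assumes the response is UTF-8.
    static String fetch(String pageUrl) throws IOException {
        StringBuilder html = new StringBuilder();
        try (InputStream in = URI.create(pageUrl).toURL().openStream();
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Reduce rendered HTML to plain text suitable for indexing.
    // Crude by design: drops script/style blocks, replaces every
    // remaining tag with a space, then collapses whitespace.
    static String extractText(String html) {
        String noScripts = html.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ");
        String noTags = noScripts.replaceAll("(?s)<[^>]*>", " ");
        return noTags.replaceAll("\\s+", " ").trim();
    }
}
```

Calling RenderedPageFetcher.extractText(RenderedPageFetcher.fetch(url)) for each page URL gives you a string you can hand to whatever indexing code you already have. Following links from page to page is left out here; a fixed list of URLs is probably enough for a mostly static site.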

Chris Opler

w i l l i a m__b o y d wrote:

> >      If they're mostly static, why not just code a little crawler to
> > request the pages via the web-server and parse the rendered HTML?
> >
>
> right then. i've added that onto my list of things to do. immediately after
> "meet project deadline" and "...learning javacc and lucene inside and
> out..." ;) if anyone has such code they're willing to contribute i would
> put it to good use.
>
> ----- Original Message -----
> From: Steven J. Owens <[EMAIL PROTECTED]>
> To: Lucene Users List <[EMAIL PROTECTED]>; w i l l i a m__b o y
> d <[EMAIL PROTECTED]>
> Sent: Sunday, February 24, 2002 1:25 AM
> Subject: Re: JSP Parser class wanted
>
> > w i l l i a m__b o y d <[EMAIL PROTECTED]> writes:
> >
> > > i have had some success in solving my problem. mind you, it is a
> > > hack; a quick fix. it may or may not work for everyone. also the jsp
> > > pages i am indexing/searching have very little dynamically generated
> > > content. they are mostly static.
> >
> >      If they're mostly static, why not just code a little crawler to
> > request the pages via the web-server and parse the rendered HTML?
> >
> > Steven J. Owens
> > [EMAIL PROTECTED]
>
> --
> To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

--
=======================
http://www.openwine.org


