hi...

suppose i simply want to capture 'all' the information/text/html from a site.
if i want to mirror the site as it is at this exact moment, then i'd also want
to capture the forms (both GET and POST actions). are you saying that nutch
wouldn't/shouldn't do this?

-bruce


-----Original Message-----
From: Fuad Efendi [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 21, 2006 9:41 PM
To: [email protected]
Subject: RE: following forms using nutch...


According to the HTTP/1.1 spec (RFC 2616, section 9.5,
POST method, page 55):
"Responses to this method are not cacheable, unless the response
includes appropriate Cache-Control or Expires header fields."
http://www.ietf.org/rfc/rfc2616.txt

So Nutch _should_not_ store information retrieved via POST anywhere...
Web developers _expect_ that such pages won't be cached...
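
To make the quoted rule concrete, here is a minimal sketch of how a crawler
might decide whether to keep a copy of a response. It is a deliberately
simplified reading of that one sentence, not Nutch code: the function name
may_store_response is hypothetical, and treating "no-store" and "private" as
the only disqualifying Cache-Control directives is an assumption.

```python
def may_store_response(method, headers):
    """Return True if a crawler may keep a copy of this response.

    Simplified reading of RFC 2616: a POST response is storable only
    when it carries Cache-Control or Expires header fields.
    """
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value.lower() for name, value in headers.items()}
    cache_control = normalised.get("cache-control", "")
    directives = {d.strip() for d in cache_control.split(",") if d.strip()}

    # Assumption: a crawler behaves like a shared cache, so these
    # directives rule out storing the response for any method.
    if directives & {"no-store", "private"}:
        return False

    # Assumption: non-POST responses are judged by other rules elsewhere.
    if method.upper() != "POST":
        return True

    # POST: storable only with explicit caching information.
    return bool(directives) or "expires" in normalised
```

With this sketch, a POST response with no caching headers is rejected, while
the same response with an Expires header would be kept.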

Suppose we have a form on a forum (or a simple 'Contact Me' form): will Nutch
post dummy messages? Following POST forms was treated as a bug and fixed, so
Nutch does not follow 'post' anymore (I believe...)
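
As an illustration of the "follow GET, skip POST" behaviour, here is a small
sketch using only the Python standard library. It is not how Nutch is
implemented; FormCollector and followable_forms are hypothetical names, and
the example URL in the usage note is made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormCollector(HTMLParser):
    """Collect (method, absolute action URL) for every <form> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        attributes = dict(attrs)
        # HTML forms default to GET when no method attribute is given.
        method = (attributes.get("method") or "get").upper()
        # A missing action means the form submits to the page itself.
        action = urljoin(self.base_url, attributes.get("action") or "")
        self.forms.append((method, action))


def followable_forms(html, base_url):
    """Return action URLs of GET forms only; POST forms are skipped."""
    collector = FormCollector(base_url)
    collector.feed(html)
    collector.close()
    return [action for method, action in collector.forms if method == "GET"]
```

For a page at http://example.edu/dir/page.html containing a GET search form
and a POST contact form, only the search form's action URL is returned, so
the crawler never submits the contact form.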

Thanks


-----Original Message-----
From: Honda-Search Administrator

Bruce,

There is no reason you shouldn't be able to use POST, especially if you use
the opensearch method to display your results.

Matt
----- Original Message -----
From: "bruce"
> hi...
>
> not sure whether this should be a dev/user question...
>
> some of the archives seem to indicate that nutch doesn't/can't/perhaps
> shouldn't follow a form that uses POST... is this correct, and if it is,
> can someone tell me why?
>
> can nutch handle forms that use GET??
>
> i'm looking to extract some information off of public college sites, and
> some of the sites use POST, while others use GET with their forms...
>
> thanks
>
> -bruce

