hi... suppose i simply want to capture 'all' the information/text/html from a site, because i want to mirror the site as it is at this exact moment. then i'd want to capture the forms too (both GET and POST actions). are you saying that nutch wouldn't/shouldn't do this?
-bruce

-----Original Message-----
From: Fuad Efendi [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 21, 2006 9:41 PM
To: [email protected]
Subject: RE: following forms using nutch...

According to HTTP/1.1 specs, POST method, page 55, RFC 2616:
"Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields."
http://www.ietf.org/rfc/rfc2616.txt

So, Nutch _should_not_ store anywhere information retrieved via POST... Web-Developers _expect_ that such pages won't be cached...

Suppose we have a form on a forum (or simple 'Contact Me' form), will Nutch post dummy messages?

It was fixed as a bug, and Nutch does not follow 'post' anymore (I believe...)

Thanks

-----Original Message-----
From: Honda-Search Administrator

Bruce,

There is no reason you shouldn't be able to use POST, especially if you use the opensearch method to display your results.

Matt

----- Original Message -----
From: "bruce"

> hi...
>
> not sure whether this should be a dev/user question...
>
> some of the archives seem to indicate that nutch doesn't/can't/perhaps
> shouldn't follow a form that uses POST... is this correct, and if it is,
> can someone tell me why?
>
> can nutch handle forms that use GET??
>
> i'm looking to extract some information off of public college sites, and
> some of the sites use POST, while others use GET with their forms...
>
> thanks
>
> -bruce

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
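
For readers of the archive, the policy Fuad describes in this thread (follow GET forms, do not submit POST forms, and do not store POST responses unless the server explicitly allows caching) could be sketched roughly as below. This is only an illustration in Python and not Nutch code: the function names `should_follow_form` and `may_store_response` are hypothetical, and the reading of "appropriate" Cache-Control fields is a simplifying assumption that ignores status codes, validators and most other caching rules in RFC 2616.

```python
from typing import Mapping

# Cache-Control directives that forbid storing the response (simplified).
FORBID_STORE = {"no-store", "no-cache", "private"}
# Directives treated here as explicitly allowing caching (an assumption).
ALLOW_STORE = {"public", "max-age", "s-maxage"}


def should_follow_form(method: str) -> bool:
    """Follow a form only if it uses GET; HTML forms default to GET."""
    return (method or "GET").strip().upper() == "GET"


def may_store_response(method: str, headers: Mapping[str, str]) -> bool:
    """Decide whether a crawler may keep a copy of a response.

    GET responses are storable unless Cache-Control forbids it.
    POST responses are storable only when Cache-Control or Expires
    is present, following the sentence quoted from RFC 2616.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    cache_control = lowered.get("cache-control", "").lower()
    # Keep only directive names, e.g. "max-age=3600" becomes "max-age".
    directives = {
        part.strip().split("=")[0]
        for part in cache_control.split(",")
        if part.strip()
    }
    if directives & FORBID_STORE:
        return False
    if method.strip().upper() == "POST":
        return bool(directives & ALLOW_STORE) or "expires" in lowered
    return True
```

Under this sketch, a 'Contact Me' form declared with method="post" would never be submitted, while a search form that uses GET would be followed like an ordinary link.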
