Hey, I am confused about crawling with Nutch.
As you know, some websites cannot be reached through a plain URL because
their pages are served in response to a "POST" request. That means that
even if you know the page's URL, entering it in the address bar of IE or
Mozilla sends a GET request, and some important content of the page is
missing.
What should I do? Should I write a plugin to extend the crawler?
For example:
http://www.51job.com/hot/show_job_detail.php?id=100655204&jobiduni=(102344234)
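
To make the GET/POST difference concrete, here is a minimal sketch in Python (it is not Nutch code). It assumes, purely for illustration, that the site would accept the parameters of the URL above as form fields in a POST body, which is not confirmed. The helper name build_post_request is hypothetical, and the request is only built, not sent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request


def build_post_request(url):
    """Move the query parameters of `url` into a form-encoded POST body.

    Hypothetical helper: whether the server accepts these fields via POST
    is an assumption, not something known about the site.
    """
    parts = urlsplit(url)
    # Split "a=1&b=2" into [("a", "1"), ("b", "2")], keeping empty values.
    fields = parse_qsl(parts.query, keep_blank_values=True)
    body = urlencode(fields).encode("ascii")
    # Rebuild the URL without its query string or fragment.
    bare_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return Request(
        bare_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    url = ("http://www.51job.com/hot/show_job_detail.php"
           "?id=100655204&jobiduni=(102344234)")
    req = build_post_request(url)
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
    # Sending it would be urllib.request.urlopen(req); deliberately not done here.
```

A browser address bar or a default crawler fetch corresponds to a GET on the full URL. The sketch shows the other shape of request, where the same fields travel in the request body instead of the URL.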

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
