Hi all,

There is currently no API for Google Groups. I want to collect all the post information from a Google Group that I am a member of, so I tried using Nutch to crawl the group and fetch all of its post pages.
I am not sure whether Nutch can crawl Google products such as Google Groups. This March I successfully crawled some public pages with Nutch. Over the past few days I have been trying to crawl the Google Group by following the authentication tutorial at http://wiki.apache.org/nutch/HttpAuthenticationSchemes. I set conf/httpclient-auth.xml like this:

    <credentials username="susam" password="masus">
      <default/>
    </credentials>

However, only the first page, the one given in the seed URL list, gets fetched. The pages that contain the post information are not fetched.

Am I missing anything here? Can Nutch crawl websites like Google Groups?

Thank you very much!

