Thanks. You can open http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110&from=ePortal_NewsDetail_FromHome
and notice the upper right corner: it has two translation links, and they lead to those two URLs, so I am worried.

2009/3/20 yanky young <yanky.yo...@gmail.com>

> That must work, but it seems weird. You know, Nutch crawls outward from the
> seed URL you gave it, so the crawled pages actually form a tree whose root
> node is the seed URL. If you cannot reach those two URLs from the seed URL
> yourself, Nutch cannot either.
>
> yanky
>
>
> 2009/3/20 陈琛 <kylin.chc...@gmail.com>
>
> > Thanks.
> > The URL is http://www.laopdr.gov.la/...
> > depth 15, topN 1200 ...
> >
> > It seems I must put
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
> >
> > in the urls directory.
> >
> >
> > 2009/3/19 yanky young <yanky.yo...@gmail.com>
> >
> > > Hi:
> > >
> > > I guess the URLs you mentioned all point to the same JSP or servlet;
> > > apparently they all begin with
> > > http://app02.laopdr.gov.la/ePortal/news/detail.action.
> > > The difference is the request_locale parameter. I have no idea how these
> > > two URLs with different request_locale parameters are generated, but I
> > > guess Nutch just doesn't know about the request_locale parameter, because
> > > it may be added by JavaScript or by the backend content management
> > > system. Maybe you can write these links in a page that Nutch can crawl.
> > > The point is that these links must be findable somewhere in the pages of
> > > your website. If not, Nutch cannot find them.
> > >
> > > Good luck
> > >
> > > yanky
> > >
> > >
> > > 2009/3/19 陈琛 <kylin.chc...@gmail.com>
> > >
> > > > Please help me, it is urgent and important. Thanks.
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: 陈琛 <kylin.chc...@gmail.com>
> > > > Date: 2009/3/19
> > > > Subject: index web
> > > > To: nutch-user@lucene.apache.org
> > > >
> > > >
> > > > Hi, all:
> > > >
> > > > I can index a URL like
> > > >
> > > > http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110&from=ePortal_NewsDetail_FromHome
> > > >
> > > > but cannot index URLs like
> > > >
> > > > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=en_US&id=10110&from=ePortal_NewsDetail_FromHome
> > > >
> > > > and
> > > >
> > > > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
> > > >
> > > > Why are they not indexed?
> > > > Are the pages any different?
> > > >
> > > > Please notice "request_locale=".
> > > >
> > > > Thanks
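
A quick way to check the point made in the thread, that the request_locale links must appear somewhere in the site's pages for Nutch to find them, is to scan a page's static HTML for such hrefs. The following is only a rough sketch in Python and is not part of Nutch: the names LinkCollector and locale_links and the sample HTML are made up for illustration, and it assumes the crawler follows plain <a href> links without executing JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags from static HTML (no JavaScript is run)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def locale_links(html, base_url):
    """Return the outlinks whose query string has a request_locale parameter."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [
        url
        for url in collector.links
        if "request_locale" in parse_qs(urlparse(url).query)
    ]


if __name__ == "__main__":
    # Hypothetical markup: one plain link and one link built by JavaScript.
    sample = """
    <a href="detail.action?request_locale=en_US&amp;id=10110">English</a>
    <a href="javascript:void(0)" onclick="switchLocale('lo_LA')">Lao</a>
    """
    base = "http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110"
    for url in locale_links(sample, base):
        print(url)
```

If the saved HTML of the news page yields no request_locale links this way, the translate links are probably generated by script, and listing those URLs in the urls directory (as suggested above) would be the fallback.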