Thanks. The URL is http://www.laopdr.gov.la/... depth 15 topN 1200 ...
It seems I must put
http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
in the urls directory.

2009/3/19 yanky young <yanky.yo...@gmail.com>

> Hi:
>
> I guess the URLs you mentioned all point to the same JSP or servlet;
> they all begin with
> http://app02.laopdr.gov.la/ePortal/news/detail.action
> The only difference is the request_locale parameter. I have no idea how
> the two URLs with different request_locale values are generated, but I
> guess Nutch just doesn't know about them, because the parameter may be
> added by JavaScript or by the backend content management system. Maybe
> you can write these links into a page that Nutch can crawl. The point is
> that these links must be findable somewhere in your website's pages; if
> not, Nutch cannot find them.
>
> good luck
>
> yanky
>
> 2009/3/19 陈琛 <kylin.chc...@gmail.com>
>
> > Please help me, it is urgent and important. Thanks.
> >
> > ---------- Forwarded message ----------
> > From: 陈琛 <kylin.chc...@gmail.com>
> > Date: 2009/3/19
> > Subject: index web
> > To: nutch-user@lucene.apache.org
> >
> > Hi all,
> >
> > I can index a URL like
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110&from=ePortal_NewsDetail_FromHome
> >
> > but cannot index URLs like
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=en_US&id=10110&from=ePortal_NewsDetail_FromHome
> >
> > and
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
> >
> > Why are these not indexed?
> > Is there any difference between the pages?
> >
> > Please note the "request_locale=" parameter.
> >
> > Thanks
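
As a rough sketch of the seed-list approach discussed in this thread (putting the locale-specific URLs in the urls directory so that Nutch does not have to discover them through links), the short Python script below prints one seed URL per locale for a given detail page. The helper names with_locale and seed_lines and the list of locales are assumptions made for illustration; they are not part of Nutch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_locale(url, locale):
    """Return url with request_locale set to locale as the first query parameter."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous request_locale.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "request_locale"
    ]
    query = urlencode([("request_locale", locale)] + params)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


def seed_lines(base_urls, locales):
    """Build one seed URL per (base URL, locale) pair, in input order."""
    return [with_locale(url, locale) for url in base_urls for locale in locales]


if __name__ == "__main__":
    # Base URL taken from the thread; the locale list is an assumption.
    base_urls = [
        "http://app02.laopdr.gov.la/ePortal/news/detail.action"
        "?id=10110&from=ePortal_NewsDetail_FromHome"
    ]
    for line in seed_lines(base_urls, ["en_US", "lo_LA"]):
        print(line)
```

Redirecting the output into a text file inside the urls directory would be one way to use it; the file name is up to you.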