Thanks.

You can open
http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110&from=ePortal_NewsDetail_FromHome

and look at the upper right corner: there are two translation links, and they
lead to those two URLs.

So the URLs can be reached from that page, and I am still worried.
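
One way to check whether those two links are visible to a crawler is to parse the static HTML and list every link that carries request_locale. The following is only a rough sketch using the Python standard library, not how Nutch itself parses pages. SAMPLE_HTML is a made-up stand-in for the real page, and switchLocale is a hypothetical script function; a link that exists only through script would not show up here.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute href values of <a> tags found in static HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def links_with_param(html, base_url, param):
    """Return the links in html whose query string contains param."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [u for u in collector.links if param in parse_qs(urlsplit(u).query)]


BASE = ("http://app02.laopdr.gov.la/ePortal/news/detail.action"
        "?id=10110&from=ePortal_NewsDetail_FromHome")

# Hypothetical markup, assumed for illustration only.
SAMPLE_HTML = """
<a href="detail.action?request_locale=en_US&amp;id=10110&amp;from=ePortal_NewsDetail_FromHome">English</a>
<a href="detail.action?request_locale=lo_LA&amp;id=10110&amp;from=ePortal_NewsDetail_FromHome">Lao</a>
<a href="#" onclick="switchLocale('en_US')">English (script only)</a>
"""

found = links_with_param(SAMPLE_HTML, BASE, "request_locale")
for url in found:
    print(url)
```

If the real page source gives an empty list here, the links are probably produced by script and a crawler that reads only static HTML will not see them.
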
2009/3/20 yanky young <yanky.yo...@gmail.com>

> That should work, but it seems odd. Starting from the seed URL you gave,
> Nutch follows links page by page, so the crawled pages form a tree whose
> root node is the seed URL. If you cannot reach those two URLs from
> the seed URL yourself, Nutch cannot either.
>
> yanky
>
>
> 2009/3/20 陈琛 <kylin.chc...@gmail.com>
>
> > Thanks.
> > The seed URL is http://www.laopdr.gov.la/...
> > with depth 15 and topN 1200 ...
> >
> > It seems I must put
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
> >
> > in the urls directory.
> >
> >
> >
> > 2009/3/19 yanky young <yanky.yo...@gmail.com>
> >
> > > Hi:
> > >
> > > I guess the URLs you mentioned all point to the same JSP or servlet;
> > > they all begin with
> > > http://app02.laopdr.gov.la/ePortal/news/detail.action
> > > The only difference is the request_locale parameter. I have no idea how
> > > these two URLs with different request_locale values are generated, but I
> > > guess Nutch simply never sees the request_locale parameter, because it
> > > may be added by JavaScript or by the backend content management system.
> > > Maybe you can write these links into a page that Nutch can crawl. The
> > > point is that these links must appear somewhere in the pages of your
> > > website. If they do not, Nutch cannot find them.
> > >
> > > good luck
> > >
> > > yanky
> > >
> > >
> > >
> > > 2009/3/19 陈琛 <kylin.chc...@gmail.com>
> > >
> > > > Please help me; it is urgent and important. Thanks.
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: 陈琛 <kylin.chc...@gmail.com>
> > > > Date: 2009/3/19
> > > > Subject: index web
> > > > To: nutch-user@lucene.apache.org
> > > >
> > > >
> > > > Hi all,
> > > >
> > > > I can index a URL like
> > > >
> > > > http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110&from=ePortal_NewsDetail_FromHome
> > > >
> > > > but I cannot index URLs like
> > > >
> > > > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=en_US&id=10110&from=ePortal_NewsDetail_FromHome
> > > >
> > > > and
> > > >
> > > > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
> > > >
> > > > Why are they not indexed?
> > > > Is there any difference between these pages?
> > > >
> > > > Please note the "request_locale=" parameter.
> > > >
> > > >
> > > > Thanks
> > > >
> > >
> >
>
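
To make the comparison discussed in this thread concrete, here is a small sketch (plain Python standard library, not part of Nutch) that confirms the three URLs point to the same detail.action resource and differ only in the request_locale parameter. The helper names split_url and extra_params are invented for this illustration.

```python
from urllib.parse import parse_qsl, urlsplit


def split_url(url):
    """Return ((scheme, host, path), query parameters as a dict)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc, parts.path), dict(parse_qsl(parts.query))


def extra_params(base_url, other_url):
    """Return the query parameters of other_url that differ from base_url."""
    base_target, base_query = split_url(base_url)
    other_target, other_query = split_url(other_url)
    if base_target != other_target:
        raise ValueError("URLs point at different resources")
    return {k: v for k, v in other_query.items() if base_query.get(k) != v}


PREFIX = "http://app02.laopdr.gov.la/ePortal/news/detail.action"
TAIL = "id=10110&from=ePortal_NewsDetail_FromHome"

INDEXED = f"{PREFIX}?{TAIL}"                      # the URL that gets indexed
ENGLISH = f"{PREFIX}?request_locale=en_US&{TAIL}"  # not indexed
LAO = f"{PREFIX}?request_locale=lo_LA&{TAIL}"      # not indexed

for url in (ENGLISH, LAO):
    print(extra_params(INDEXED, url))
```

Since the path and the other parameters are identical, the open question is only where the request_locale links come from and whether the crawler ever sees them.
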
