Thanks.

The URL is http://www.laopdr.gov.la/...
with depth 15, topN 1200 ...

It seems I must put
http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
in the urls directory.
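
For anyone hitting the same problem, here is a rough sketch of generating the locale variants and writing them into the seed directory, instead of pasting each URL by hand. It is only an illustration: the helper names (with_locale, write_seeds) and the seed file name seed.txt are my own assumptions, and the two locale values are simply the ones mentioned in this thread.

```python
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# URL that already gets indexed (taken from this thread).
BASE_URL = ("http://app02.laopdr.gov.la/ePortal/news/detail.action"
            "?id=10110&from=ePortal_NewsDetail_FromHome")
# The two request_locale values seen in this thread.
LOCALES = ["en_US", "lo_LA"]


def with_locale(url, locale):
    """Return url with request_locale=<locale> as the first query parameter."""
    parts = urlsplit(url)
    # Drop any existing request_locale so it is not duplicated.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "request_locale"]
    query = urlencode([("request_locale", locale)] + params)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


def write_seeds(seed_dir, urls, name="seed.txt"):
    """Write one URL per line to seed_dir/name (the file name is an assumption)."""
    seed_dir = Path(seed_dir)
    seed_dir.mkdir(parents=True, exist_ok=True)
    seed_file = seed_dir / name
    seed_file.write_text("\n".join(urls) + "\n", encoding="utf-8")
    return seed_file


if __name__ == "__main__":
    seeds = [with_locale(BASE_URL, loc) for loc in LOCALES]
    print(write_seeds("urls", seeds))
```

This only saves typing when there are many ids or locales. As yanky says below, the crawler still has to be told about these URLs somehow, either through the seed list or through a page that links to them.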



2009/3/19 yanky young <yanky.yo...@gmail.com>

> Hi:
>
> I guess the URLs you mentioned all point to the same JSP or servlet;
> they all begin with
> http://app02.laopdr.gov.la/ePortal/news/detail.action.
> The difference is the request_locale parameter. I have no idea how the
> two URLs with different request_locale values are generated, but I guess
> Nutch simply doesn't know about the request_locale parameter, because it
> may be added by JavaScript or by the backend content management system.
> Maybe you can put these links in a page that Nutch can crawl. The point is
> that these links must appear somewhere in the pages of your website. If
> not, Nutch cannot find them.
>
> Good luck,
>
> yanky
>
>
>
> 2009/3/19 陈琛 <kylin.chc...@gmail.com>
>
> > Please help me, it is urgent and important. Thanks.
> >
> > ---------- Forwarded message ----------
> > From: 陈琛 <kylin.chc...@gmail.com>
> > Date: 2009/3/19
> > Subject: index web
> > To: nutch-user@lucene.apache.org
> >
> >
> > Hi all,
> >
> > I can get URLs like this one indexed:
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110&from=ePortal_NewsDetail_FromHome
> >
> > but I cannot get URLs like these indexed:
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=en_US&id=10110&from=ePortal_NewsDetail_FromHome
> >
> > and
> >
> > http://app02.laopdr.gov.la/ePortal/news/detail.action?request_locale=lo_LA&id=10110&from=ePortal_NewsDetail_FromHome
> >
> > Why are they not indexed?
> > Is there any difference between these pages?
> >
> > Please note the "request_locale=" parameter.
> >
> > Thanks
> >
>
