Yes, you are right, every page on the site has the two links.
But I did not build the site. If I get the opportunity, I will try.
Thank you very much for the help, it really helped me a lot :)
2009/3/20 yanky young yanky.yo...@gmail.com
not really
i guess any page in this website
thanks
You can go to
http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10110from=ePortal_NewsDetail_FromHome
and look at the upper right corner: there are two translate links, and they
lead to those two URLs.
So I am worried.
2009/3/20 yanky young yanky.yo...@gmail.com
I think my guess is right. I just looked at the source of that page.
Those two URLs are generated by a JavaScript function:
function jump(lan)
In this case, Nutch might not be smart enough to recognize this kind of
generated URL.
But if you generate these two links on the server side, then the
URLs in
Not really.
I guess any page on this website can have the two links generated by a
JavaScript function. That is why Nutch can't find those URLs: Nutch will not
click the link to trigger the JS function the way a human does.
I suggest that you generate those multilingual links on the server side,
for
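
To illustrate the point about JavaScript-generated links, here is a minimal sketch of what a purely static link extractor sees. It uses Python's standard html.parser, not Nutch's own parser, and both markup snippets are hypothetical: the real page's HTML and the body of jump(lan) are not shown in this thread.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects href values from <a> tags, roughly what a static parser sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip empty hrefs, "#" anchors and javascript: pseudo-links;
            # none of them is a URL a crawler could fetch.
            if name == "href" and value and not value.startswith(("javascript:", "#")):
                self.links.append(value)


def extract_links(html):
    collector = HrefCollector()
    collector.feed(html)
    return collector.links


# Hypothetical markup: the language switch only calls jump(), so no URL is in the HTML.
js_page = '<a href="#" onclick="jump(\'en\')">English</a>'

# Hypothetical server-rendered alternative: the target URL is written into the href.
server_page = '<a href="detail.action?id=1&amp;request_locale=en">English</a>'

js_links = extract_links(js_page)          # nothing to follow
server_links = extract_links(server_page)  # one ordinary link
```

With the first snippet the extractor finds no URL at all, while the server-rendered anchor yields a normal link that a crawler can queue.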
Hi:
I guess the URLs you mentioned all point to the same JSP or servlet;
apparently they all begin with
http://app02.laopdr.gov.la/ePortal/news/detail.action
The difference is the request_locale parameter.
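
If the pages really differ only in request_locale, one possible workaround is to build the per-language URLs yourself and hand them to the crawler as extra seeds. The sketch below is only an assumption about how that could look: the base URL is a placeholder, the helper name with_locale is made up for illustration, and the locale codes "en" and "lo" are guesses, since the actual values are not given in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_locale(url, locale):
    """Return url with its request_locale query parameter set to locale."""
    parts = urlsplit(url)
    # Drop any existing request_locale so the parameter appears exactly once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "request_locale"
    ]
    query.append(("request_locale", locale))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, not the real portal address.
base = "http://example.org/news/detail.action?id=1"

# Locale codes are guesses; use whatever values the site actually accepts.
variants = [with_locale(base, locale) for locale in ("en", "lo")]
```

Each entry in variants could then go on its own line in the seed URL file.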
Thanks.
The URL is http://www.laopdr.gov.la/...
depth 15, topN 1200 ...
It seems I must put
That must work, but it seems weird. You know, Nutch starts crawling from the
seed URL you give it, and all the crawled pages actually form a tree whose
root node is the seed URL. If you cannot reach those two URLs from the seed
URL yourself, Nutch cannot either.
yanky
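
The tree argument above can be sketched as a breadth-first walk with a depth limit. This is a toy model, not Nutch's actual generate/fetch cycle (it ignores topN, for instance), and the page names and link graph are invented for illustration.

```python
from collections import deque


def reachable(seed, links, max_depth):
    """Breadth-first walk over a link graph, stopping max_depth hops from the seed."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            # Do not follow links from pages already at the depth limit.
            continue
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


# Hypothetical site: "translated" exists, but no page links to it in static HTML.
links = {
    "seed": ["news"],
    "news": ["detail"],
    "detail": [],
    "translated": [],
}

found = reachable("seed", links, max_depth=15)
```

However large the depth limit is, "translated" never shows up in found, because no discovered page links to it. Adding it as a second seed, or exposing it through a normal link, is what makes it reachable.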
2009/3/20 陈琛