Hi:

Because they are actually the same page, you can only find one. Here is what I see when I use wget to fetch http://app02.laopdr.gov.la/:

C:\Documents and Settings\yanky>wget http://app02.laopdr.gov.la
--2009-03-03 23:41:19--  http://app02.laopdr.gov.la/
Resolving app02.laopdr.gov.la... 203.110.66.105
Connecting to app02.laopdr.gov.la|203.110.66.105|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://app02.laopdr.gov.la/ePortal [following]
--2009-03-03 23:41:20--  http://app02.laopdr.gov.la/ePortal
Connecting to app02.laopdr.gov.la|203.110.66.105|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://app02.laopdr.gov.la/ePortal/ [following]
--2009-03-03 23:41:20--  http://app02.laopdr.gov.la/ePortal/
Connecting to app02.laopdr.gov.la|203.110.66.105|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US [following]
--2009-03-03 23:41:21--  http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
Connecting to app02.laopdr.gov.la|203.110.66.105|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `home.act...@request_locale=en_us'

As you can see, http://app02.laopdr.gov.la goes through several 302 redirects and ends up at http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US. So when Nutch fetches http://app02.laopdr.gov.la, it actually fetches http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US, and in the end only the content of that page is fetched and indexed.

This has nothing to do with dynamic pages. It is about how Nutch processes the 302 status.

good luck

yanky

2009/3/4 Yves Yu <[email protected]>

> thank you for your answer.
> I'm feeling strange because http://app02.laopdr.gov.la/ just as same as
> http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
> but I cannot find it.
>
> you could see a few frames such as "Hot Event", "Businees" in
> http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
> when I copy a few words in these frames, I cannot find this homepage.
> but nutch can find the page which in "more>>" by same words.
>
> I can see both http://app02.laopdr.gov.la/ and
> http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
> in my fetch log, but I just cannot find the page.
>
> I'm doubting about dynamic pages... is that reasonable?
>
> 2009/3/3 yanky young <[email protected]>
>
> > Hi:
> >
> > Why do u think nutch can't find
> > http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
> >
> > Actually http://app02.laopdr.gov.la/ is the same page as
> > http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
> >
> > if you find http://app02.laopdr.gov.la in your log, the page you said
> > must be downloaded..
> >
> > good luck
> >
> > yanky
> >
> > 2009/3/3 Yves Yu <[email protected]>
> >
> > > Hi, all,
> > >
> > > I met a situation, need help, thank you in advance.
> > > I added
> > > http://app02.laopdr.gov.la/
> > > into urls.txt
> > >
> > > nutch can find
> > >
> > > http://app02.laopdr.gov.la/ePortal/news/detail.action?id=10109&from=ePortal_NewsDetail_FromHome
> > >
> > > but nutch cannot find
> > >
> > > http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US
> > >
> > > anybody has any idea?
> > >
> > > Yves
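
P.S. In case it helps to see the redirect chain spelled out, here is a minimal Python sketch. It is not Nutch code and it does not touch the network: the REDIRECTS table is simply copied from the 302 hops in the wget log above, and the names resolve, REDIRECTS and max_hops are made up for this illustration. It only shows that the seed URL and the home.action URL end at the same document, which is why you should expect one page and not two.

```python
# Redirect hops copied from the wget log: each key answered
# "302 Moved Temporarily" with the Location given as its value.
REDIRECTS = {
    "http://app02.laopdr.gov.la/":
        "http://app02.laopdr.gov.la/ePortal",
    "http://app02.laopdr.gov.la/ePortal":
        "http://app02.laopdr.gov.la/ePortal/",
    "http://app02.laopdr.gov.la/ePortal/":
        "http://app02.laopdr.gov.la/ePortal/home/home.action?request_locale=en_US",
}


def resolve(url, redirects=REDIRECTS, max_hops=10):
    """Follow the simulated 302 hops; return (final_url, chain of URLs visited)."""
    chain = [url]
    while url in redirects:
        # Guard against redirect loops (hypothetical limit, not a Nutch setting).
        if len(chain) > max_hops:
            raise RuntimeError("too many redirects")
        url = redirects[url]
        chain.append(url)
    return url, chain


final_url, chain = resolve("http://app02.laopdr.gov.la/")
print(" -> ".join(chain))
print("URL that finally answers 200 OK:", final_url)
```

Resolving the home.action URL directly returns it unchanged, since it is not in the table, so both starting points lead to the same single page.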
