Hi, please change the value of the "db.max.outlinks.per.page" property (default is 100) to, say, 1000:

<property>
  <name>db.max.outlinks.per.page</name>
  <value>1000</value>
  <description>The maximum number of outlinks that we'll process for a page.
  </description>
</property>

/Jack

On 1/20/06, Nguyen Ngoc Giang <[EMAIL PROTECTED]> wrote:
> Hi everyone,
>
> I found that getOutlinks function in html-parser/DOMContentUtils.java
> doesn't work correctly for some cases. An example is this website:
> http://blog.donews.com/boyla/. The function returns only 170 records, while
> in fact it contains a lot more (Firefox returns 356 links!).
>
> When I compare the hyperlink list with the one returned by Firefox, the
> orders are exactly identical, meaning that the 170th link of getOutlinks
> function is the same as the 170th link of Firefox. Therefore, it seems that
> the algorithm is correct, but there is some bug around. There is no
> threshold at this point, since the max outlinks parameter is set at updatedb
> part. Even when I increase the max outlinks to 1000, the situation still
> remains.
>
> Any suggestions are very appreciated.
>
> Regards,
> Giang
>
>

--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
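
A note on the symptom in the quoted message: a per-page cap would cut the list short while leaving the order of the surviving links unchanged, which matches the "same order, fewer links" observation. The sketch below illustrates only that behavior. OutlinkCapSketch and cap are hypothetical names, not Nutch code, and the example.com URLs are placeholders; whether such a cap actually explains the 170-link result is not established in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not taken from Nutch.
class OutlinkCapSketch {

    // Keep at most maxOutlinks links, preserving document order.
    static List<String> cap(List<String> links, int maxOutlinks) {
        int n = Math.min(Math.max(maxOutlinks, 0), links.size());
        return new ArrayList<>(links.subList(0, n));
    }

    public static void main(String[] args) {
        // Placeholder links standing in for the 356 links Firefox reports.
        List<String> links = new ArrayList<>();
        for (int i = 1; i <= 356; i++) {
            links.add("http://example.com/link" + i);
        }

        List<String> capped = cap(links, 100);
        System.out.println(capped.size());                          // 100
        System.out.println(capped.get(99).equals(links.get(99)));   // true
    }
}
```

With a cap of 1000 the same call returns all 356 links, so if a cap were the cause, raising it past the page's link count would make the truncation disappear.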
