Hi

Please try raising the "db.max.outlinks.per.page" property (the default is
100) to, say, 1000:

<property>
  <name>db.max.outlinks.per.page</name>
  <value>1000</value>
  <description>The maximum number of outlinks that we'll process for a page.
  </description>
</property>
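
For what it's worth, a cap like this keeps the first N links in document
order, so a capped list is always a prefix of the full one. A rough sketch of
that behaviour follows. The names OutlinkCap and cap are made up for
illustration; this is not Nutch's actual code, and it only assumes the limit is
applied by simple truncation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of an outlink cap; not Nutch's actual code. */
final class OutlinkCap {

    /** Returns at most maxOutlinks links, keeping the original document order. */
    static List<String> cap(List<String> links, int maxOutlinks) {
        // Clamp the limit to the range [0, links.size()].
        int n = Math.min(Math.max(maxOutlinks, 0), links.size());
        // Copy the prefix so the result does not depend on the input list.
        return new ArrayList<>(links.subList(0, n));
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(cap(links, 3)); // prints [a, b, c]
    }
}
```

With a limit larger than the number of links on the page, the list comes back
unchanged.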

/Jack

On 1/20/06, Nguyen Ngoc Giang <[EMAIL PROTECTED]> wrote:
>   Hi everyone,
>
>   I found that the getOutlinks function in html-parser/DOMContentUtils.java
> doesn't work correctly in some cases. An example is this website:
> http://blog.donews.com/boyla/. The function returns only 170 records, while
> the page in fact contains many more (Firefox returns 356 links!).
>
>   When I compare the hyperlink list with the one returned by Firefox, the
> order is identical, meaning that the 170th link from the getOutlinks
> function is the same as the 170th link from Firefox. So the algorithm
> itself seems correct, but there is a bug somewhere. There should be no
> threshold at this point, since the max outlinks parameter is applied in the
> updatedb step. Even when I increase the max outlinks to 1000, the problem
> remains.
>
>   Any suggestions would be much appreciated.
>
>   Regards,
>   Giang
>
>


--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
