Murat Ali Bayir wrote:
Hi everybody, I want to know how the mergelinkdb function works. Assume that we have two linkdbs. In the first one, URLx is referred to by URLa, URLb and URLc; in the second one, the same URLx is referred to by URLa and URLk. I want to know the structure of the output linkdb.
Does it contain one entry for URLx referred to by URLa, URLb, URLc and URLk, or does it just append the second linkdb to the first one and contain two entries for URLx, as given below?
URLx <- URLa, URLb, URLc
..
..
..
URLx <- URLa, URLk



No, these two entries are merged into one (that's why the name :) ). At any given time, a valid linkdb contains at most one entry for any given target URL.
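
To make the merge semantics concrete, here is a minimal sketch in Python. It is not Nutch code: it models a linkdb as a plain dict from target URL to a set of inlink URLs, and the function name merge_linkdbs is hypothetical. It only illustrates that inlinks for the same target end up in a single entry, with duplicates such as URLa collapsed.

```python
def merge_linkdbs(*linkdbs):
    """Merge simplified linkdbs (dict: target URL -> set of inlink URLs).

    Hypothetical model for illustration only, not the Nutch implementation.
    """
    merged = {}
    for linkdb in linkdbs:
        for target, inlinks in linkdb.items():
            # One entry per target URL; inlinks from every input are unioned.
            merged.setdefault(target, set()).update(inlinks)
    return merged


linkdb1 = {"URLx": {"URLa", "URLb", "URLc"}}
linkdb2 = {"URLx": {"URLa", "URLk"}}

merged = merge_linkdbs(linkdb1, linkdb2)
print(sorted(merged["URLx"]))  # ['URLa', 'URLb', 'URLc', 'URLk']
```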

Note that there is a limit on how many inlinks are stored for any given URL (db.max.inlinks), which may lead to some surprises. For example, if linkdbA has already hit that limit and linkdbB has not, two outcomes are possible: either you get a list containing all links from linkdbA and none from linkdbB, or you get a list containing all links from linkdbB plus some links from linkdbA ...
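
The two outcomes can be illustrated with a small Python sketch. It assumes a simplified model in which inlinks are accepted in arrival order until the limit is reached; the function merge_capped and the limit of 3 are hypothetical choices for the example, and the real order in which Nutch sees the values is not under your control.

```python
def merge_capped(inlink_lists, max_inlinks):
    """Merge inlink lists for one target URL, stopping at max_inlinks.

    Hypothetical model: links are accepted in arrival order, so the
    result depends on which linkdb's values are seen first.
    """
    merged = []
    for inlinks in inlink_lists:
        for url in inlinks:
            if len(merged) >= max_inlinks:
                return merged  # limit reached, remaining links are dropped
            if url not in merged:
                merged.append(url)
    return merged


linkdb_a = ["URLa", "URLb", "URLc"]  # already at the assumed limit of 3
linkdb_b = ["URLa", "URLk"]          # below the limit

a_first = merge_capped([linkdb_a, linkdb_b], max_inlinks=3)
b_first = merge_capped([linkdb_b, linkdb_a], max_inlinks=3)

print(a_first)  # ['URLa', 'URLb', 'URLc']  all of A, nothing new from B
print(b_first)  # ['URLa', 'URLk', 'URLb']  all of B, plus some of A
```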

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

