On Dec 3, 2003, at 4:07 AM, Christoph Gummersbach wrote:

We have a problem with merging multiple databases:

(1) We run htdig/htmerge/htfuzzy on http://www.site1.com
     with site1-specific htdig.conf

(2) We run htdig/htmerge/htfuzzy on http://www.site2.com
     with site2-specific htdig.conf

(3) We copy site1 htdig.conf and all site1 db.* files to a new
     directory tree, say "allsites".

(4) We then run something like

htmerge -c allsites-htdig.conf -m site2-htdig.conf

(5) Then running an allsites htsearch shows that both databases
     are correctly merged, but the domain names of www.site2.com
     are overwritten by www.site1.com, i.e. the page paths of both
     databases are correct but all pages appear under www.site1.com
     urls.

This is of course not the intended behavior. I use much the same approach for a number of sites, and all of the original URLs are maintained throughout the process.

What version of ht://Dig are you using? Have you double-checked your configuration files for attributes such as url_part_aliases, url_rewrite_rules, or search_rewrite_rules that might be modifying your URLs?

Have you tried searching the databases before the final merge to confirm that the URLs are still correct at that point? You might also want to run htdump and check the URLs in the resulting db.docs file.
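
A rough sketch of the db.docs check, in case it helps: the function below tallies documents per host name, so a merged database that shows only www.site1.com stands out immediately. The name count_hosts is made up for this example, and the script assumes the htdump document list has tab-separated fields with the URL in a field prefixed "u:"; please verify that against the output of your own version before relying on it.

```shell
# count_hosts: hypothetical helper, not part of ht://Dig.
# Assumes each line of the htdump document list is tab-separated
# and carries the URL in a field that starts with "u:".
count_hosts() {
    awk -F '\t' '
        {
            for (i = 1; i <= NF; i++) {
                if (substr($i, 1, 2) == "u:") {
                    url = substr($i, 3)
                    # "http://host/path" splits into "http:", "", "host", ...
                    n = split(url, parts, "/")
                    if (n >= 3) count[parts[3]]++
                    break
                }
            }
        }
        END { for (h in count) printf "%d\t%s\n", count[h], h }
    ' "$1" | sort -k2
}
```

After running htdump with the same configuration (for example htdump -c allsites-htdig.conf), point count_hosts at the db.docs file in that configuration's database directory. Running it on each site's database before the merge and on the allsites database afterwards should show at which step the www.site2.com URLs disappear.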


Jim


