Hi again. You'd get a faster reply if you kept your questions on the list. See http://www.htdig.org/FAQ.html#q1.17

According to Alan Jiang:
> Thanks for your reply to my questions. Now I understand how the "to"
> string works. I tried it out for my website but came up with some errors.
> This is what I put in the configuration files:
>
> 1)htdig.conf: (for htdig)
>
> url_part_aliases: http://accounting.rutgers.edu/raw/fasb http://temp.org
>
> 2) fasb.conf: (for htsearch)
>
> url_part_aliases: http://www.fasb.org http://temp.org

Well, your choice of "to" string above suggests to me that you don't REALLY
understand how the "to" string works. Why are you using a long URL part
instead of a short, internal encoding like the *1 in the documentation's
examples? Reread http://www.htdig.org/FAQ.html#q4.17 carefully, especially
the 3rd paragraph.

> what I am trying to do is to convert the URL displayed in the search
> results from
> http://accounting.rutgers.edu/raw/fasb
>
> to
>
> http://www.fasb.org

You're almost there. Just pick a "to" string that doesn't conflict with the
strings in common_url_parts and you should be able to do this.

> I then delete all the files in directory db, run the "rundig -c htdig.conf
> -a" command. However, when I run the search, I got the following message:
>
> --------------------------------------------------------------------
> htsearch detected an error. Please report this to the webmaster of this
> site. The error message is:
>
> Unable to read word database file
> Did you run htmerge?
> -------------------------------------------------------------------
>
> I am not sure what to do about this because the rundig command is supposed
> to combine the htdig and htmerge procedure. The permission for the word
> database file is also world readable.
>
> I'd appreciate it if you could help me solve this problem. Thanks a lot.

This would suggest that htmerge isn't running to completion. Try one or more
-v options on the rundig command and see where the error is happening.
See http://www.htdig.org/FAQ.html#q4.1

> According to Alan Jiang:
> > I have a question about rewritting the URLs in the search results. I am
> > not sure how to set up the url_part_aliases configuration file
> > attribute. The document on htdig is kind of arcane because I have no clue
> > what those *1, *2 stand for.
>
> The description at http://www.htdig.org/attrs.html#url_part_aliases says,
> among other things, that "the choice of to-strings is pretty arbitrary,
> as they just provide a temporary, internal encoding in the databases, and
> none of the characters in these strings have any special meaning." So,
> they don't stand for anything - it's simple string replacement, one way
> during indexing and the opposite way when pulling URLs out of the index.
>
> > I need to replace
> > http://xxx.com/raw/test
> >
> > with
> >
> > http://www.test.com
> >
> > I know that you need two configuration files with two identical "to"
> > strings but different "from" strings. I played with it myself but it
> > didn't work out very well.
> > Any suggestions will be greatly appreciated.
>
> See also http://www.htdig.org/FAQ.html#q4.17

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
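
To make the advice about the "to" string concrete, the two configuration
files might look something like the sketch below. The to-string *fasb is
only an assumption picked for illustration; any short string that does not
overlap with the entries in common_url_parts should serve, as long as it is
identical in both files. This is untested against your setup, and it assumes
fasb.conf otherwise points at the same databases as htdig.conf.

```
# htdig.conf (indexing, via rundig): the from-string is the URL prefix
# that is actually crawled; *fasb is an arbitrary internal encoding.
url_part_aliases:	http://accounting.rutgers.edu/raw/fasb *fasb

# fasb.conf (searching, via htsearch): same to-string, but the from-string
# is the prefix that should be shown in the search results.
url_part_aliases:	http://www.fasb.org *fasb
```

With a pair like this, a URL stored during indexing as *fasb/index.html
would be expanded by htsearch to http://www.fasb.org/index.html. Because the
encoding is written into the databases, they would need to be rebuilt after
changing the to-string.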

