I have written a set of scripts that let htdig read through all of the posts in my vBulletin forum. They let htdig view every post on its own page. I then rewrite the URLs so that users see them in the "pretty" form when they do a search.

My problem is that my forum has almost 1M posts, which means htdig has 1M pages to index. I let it run for about 8 hours and it only dug about 20% of them. I need to find a way to make the indexing more palatable to the server and was hoping someone here can help.

Options I have considered:

1) Run one big dig (all 1M posts), then run nightly digs of the posts from the last 24-36 hours, and merge the dbs.
2) Break the posts up into blocks of ~50-100k pages, index each block separately, then merge the dbs.

How do you guys update your dbs? Do I need to reindex them all every time? Please help.

Also, how can I search multiple dbs at once in 3.2? Are there any docs for 3.2?

Thanks
-Rylan
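
To make option 2 and the timing figures concrete, here is a rough Python sketch. It only does the arithmetic: it does not call htdig or any of its tools. The function names `plan_blocks` and `estimated_hours` are made up for illustration, and it assumes post IDs run contiguously from 1 to the total, which may not hold on a real vBulletin forum where deleted posts leave gaps.

```python
from typing import List, Tuple


def plan_blocks(total_posts: int, block_size: int) -> List[Tuple[int, int]]:
    """Split post IDs 1..total_posts into inclusive (first, last) ranges.

    Assumption: IDs are contiguous. Each range would become the input
    for one separate dig, to be merged afterwards.
    """
    if total_posts < 0 or block_size <= 0:
        raise ValueError("total_posts must be >= 0 and block_size > 0")
    return [
        (start, min(start + block_size - 1, total_posts))
        for start in range(1, total_posts + 1, block_size)
    ]


def estimated_hours(pages: int, pages_done: int, hours_spent: float) -> float:
    """Extrapolate dig time linearly from an observed partial run."""
    rate_per_hour = pages_done / hours_spent
    return pages / rate_per_hour


# Figures from the post: ~1M posts, about 20% (200k) dug in about 8 hours.
blocks = plan_blocks(1_000_000, 100_000)
full_dig_hours = estimated_hours(1_000_000, 200_000, 8)
per_block_hours = estimated_hours(100_000, 200_000, 8)
```

At the observed rate, a linear extrapolation puts the full dig at roughly 40 hours and each 100k block at roughly 4 hours, so a block fits in a nightly window where the full dig does not. Real dig speed may not scale linearly as the database grows.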

