aseek-devel  

Re: [aseek-devel] How to index external list of URLs?

Kir Kolyshkin
Wed, 18 Sep 2002 00:24:02 -0700

J and T wrote:
> How in the world do you index a list of URLs NOT in the aspseek.conf? I 
> have tried everything I can think of:
> 
> ./index -i -f myfile.txt
> ./index -N 100
> 
> Doesn't work. The myfile.txt lists 5,000 URLs like this:
> 
> Server http://someserver.com/
> 
> But when I run the above (i.e., ./index -i -f myfile.txt)
> 
> I get the following error:
> 
> Bad URL: Server http://someserver.com/
> 
> So I removed the "Server " so now it reads:
> 
> http://someserver.com/
> 
> Did the same thing:
> 
> ./index -i -f myfile.txt
> 
> Now it shows them in the database:
> 
> ./index -S
> 
> ASPseek database statistics
> 
>    Status    Expired      Total
>   -----------------------------
>         0       5000       5000 Not indexed yet
>   -----------------------------
>     Total       5000       5000
> 
> So now I try to run the indexer:
> 
> ./index -N 100
> 
> And now the indexer gives the same damn error:
> 
> No "Server" command for URL http://www.someserver.com/ - deleted.
> ( 0  1  1  0  0  0  0 21) Adding URL: http://www.someserver.com/
> 
> So all it did was delete all these URLs. I have tried every other 
> combination I can think of after reviewing the ./index -h output, but 
> nothing seems to work. How in the world do you get these indexed using 
> an external file?
> 
> Also, before, when I hard-coded all URLs in aspseek.conf, there were about 
> 200 URLs which were always shown as "Not indexed yet". How in the heck do 
> you get them indexed or delete the damn things?
> 
> It doesn't make sense to have to add thousands of URLs to the 
> aspseek.conf file every time you want to add new URLs to the list. You 
> certainly don't want to set the system to reindex everything, especially 
> if you just added 5,000 URLs the day before. That would use unnecessary 
> bandwidth, to say the least.
> 
> Anyone have any suggestions?

Yes. Set "AllowOutside" to "yes" in aspseek.conf.
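
As a rough sketch of the whole sequence, assuming the directive is written
in the same "Name value" form as the Server lines quoted above (the exact
spelling of the line is an assumption, so check the comments in your own
aspseek.conf), add this to aspseek.conf:

AllowOutside yes

Then insert the URLs from the external file and run the indexer, using the
same commands as in the question:

./index -i -f myfile.txt
./index -N 100

With that setting in place, the indexer should stop deleting URLs that have
no matching "Server" command.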

