Yuriy Soroka
Wed, 18 Sep 2002 09:09:49 -0700
Yes, I have indexed 255,179 URLs, in batches of 20,000 to 40,000 URLs.

var dir size: 1.5 GB
I can't say for certain what the size of the MySQL database is.

Hardware: 2 CPUs at 1.1 GHz each, about 1.5 GB of RAM
OS: FreeBSD 4.5 release p6
No special kernel or MySQL tuning was done.

----- Original Message -----
From: "Gregory Kozlovsky" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, September 18, 2002 7:05 PM
Subject: RE: [aseek-devel] How to index external list of URLs?

> This is interesting. Can you share with us the size of your database (in
> docs and in GB), details of your hardware, and tuning of the Linux kernel
> and the mysql server?
>
> Gregory Kozlovsky
>
> -----Original Message-----
> From: Yuriy Soroka [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, September 18, 2002 02:43
> To: [EMAIL PROTECTED]
> Subject: Re: [aseek-devel] How to index external list of URLs?
>
>
> Why don't you just include them in aspseek.conf?
>
> I indexed 250,000 URLs.
>
> Include myfile.txt
>
>
> ----- Original Message -----
> From: "J and T" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, September 18, 2002 3:10 AM
> Subject: [aseek-devel] How to index external list of URLs?
>
>
> > How in the world do you index a list of URLs NOT in the aspseek.conf? I
> > have tried everything I can think of:
> >
> > ./index -i -f myfile.txt
> > ./index -N 100
> >
> > Doesn't work.
> > The myfile.txt lists 5,000 URLs like this:
> >
> > Server http://someserver.com/
> >
> > But when I run the above (i.e., ./index -i -f myfile.txt)
> >
> > I get the following error:
> >
> > Bad URL: Server http://someserver.com/
> >
> > So I removed the "Server " so now it reads:
> >
> > http://someserver.com/
> >
> > Did the same thing:
> >
> > ./index -i -f myfile.txt
> >
> > Now it shows them in the database:
> >
> > ./index -S
> >
> > ASPseek database statistics
> >
> > Status    Expired    Total
> > -----------------------------
> > 0         5000       5000    Not indexed yet
> > -----------------------------
> > Total     5000       5000
> >
> > So now I try to run the indexer:
> >
> > ./index -N 100
> >
> > And now the indexer gives the same damn error:
> >
> > No "Server" command for URL http://www.someserver.com/ - deleted.
> > ( 0 1 1 0 0 0 0 21) Adding URL: http://www.someserver.com/
> >
> > So all it did was delete all these URLs. I have tried every other
> > combination I can think of after reviewing the ./index -h, but nothing
> > seems to work. How in the world do you get these indexed using an
> > external file?
> >
> > Also, before, when I hard-coded all URLs in aspseek.conf, there were
> > about 200 URLs which were always shown as "Not Yet Index". How in the
> > heck do you get them indexed or delete the damn things?
> >
> > It doesn't make sense to have to add thousands of URLs to the
> > aspseek.conf file every time you want to add new URLs to the list. You
> > certainly don't want to set the system to reindex everything, especially
> > if you just added 5,000 URLs the day before. That would use unnecessary
> > bandwidth, to say the least.
> >
> > Anyone have any suggestions?
> >
> > end.
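
For anyone trying the Include approach from the reply above with a plain list of URLs, here is a minimal sketch of building the included file. It assumes, based only on this thread, that the included file holds one "Server <url>" line per URL, as in the myfile.txt example. The function name to_server_lines is made up for illustration and is not part of ASPseek.

```python
def to_server_lines(urls):
    """Turn plain URLs into "Server <url>" lines.

    Assumption taken from the thread: the file pulled in with
    "Include myfile.txt" holds one "Server <url>" directive per line.
    Blank lines and repeated URLs are skipped.
    """
    seen = set()
    lines = []
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        lines.append("Server " + url)
    return lines


# Example with in-memory data; in practice the URLs would be read from a
# file and the result written to the file named in the Include line.
example = to_server_lines(["http://someserver.com/\n", "", "http://someserver.com/"])
print("\n".join(example))
```

The output would then be saved as myfile.txt and referenced from aspseek.conf with the Include line, rather than passed to ./index -i -f, which in the original question rejected "Server" lines as bad URLs.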