I managed to find it with a direct query to the database: "select count(url_id) as cnt, site_id from urlword group by site_id order by cnt desc"
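For anyone who wants to try the same per-site count without touching the live database, here is a minimal sketch. It assumes only the table and column names from the query above (urlword, url_id, site_id). The in-memory SQLite database, the sample rows, and the helper names biggest_sites and demo are made up for illustration and stand in for the real MySQL data.

```python
import sqlite3


def biggest_sites(conn, limit=10):
    """Return (site_id, cnt) rows, largest first.

    Same grouping as the query in the message; the LIMIT is an addition
    for convenience and is not part of the original query.
    """
    cur = conn.execute(
        "SELECT site_id, COUNT(url_id) AS cnt "
        "FROM urlword GROUP BY site_id ORDER BY cnt DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


def demo():
    # Hypothetical stand-in for the real database: an in-memory SQLite
    # table with only the two columns the query needs.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urlword (url_id INTEGER, site_id INTEGER)")
    sample_rows = [(1, 100), (2, 100), (3, 100), (4, 200), (5, 300), (6, 300)]
    conn.executemany(
        "INSERT INTO urlword (url_id, site_id) VALUES (?, ?)", sample_rows
    )
    return biggest_sites(conn)


if __name__ == "__main__":
    for site_id, cnt in demo():
        print(site_id, cnt)
```

The site_id values at the top of the list are the ones worth checking first when the database grows unexpectedly.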
The problem was with these URLs:

http://www.angelfire.com/az/caucaus
http://www.geocities.com/azavto/

The indexer started to index all of angelfire and geocities. Is there any way to keep the indexer from walking outside the path given in the Server directives, rather than restricting it only by domain?

-----Original Message-----
From: Emin Hasanov [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 01, 2002 4:51 PM
To: [EMAIL PROTECTED]
Subject: Learn the biggest site

I've added some sites recently which I didn't expect to be very big, but indexing took more than 36 hours, so I stopped it by running "./index -E". The MySQL database is now over 1GB. Is there a way to learn which sites took the most space and to exclude them from the database?

One more question. I have "safely stopped" index but still can't search on sites that were added recently (I am sure that at least half of them were fully indexed). Is that how it should be?

Sincerely,
Emin
