Quoting reply by al@ to aspseek forum at www.aspseek.org
This should help you. As for regex syntax in general, read the
man pages for grep, awk, etc.; it is described there.

> > I would like to periodically clean known sites out of the
> > indexed database, and/or stop the crawler from going to
> > them. For example, I do not want the crawler to index
> > amazon.com pages.
> > 
> > When I started I did not put that restriction in place, so
> > some were added. How would I go about deleting them manually
> > and dropping any references to them?
> > 
> > Can I set a delete flag somewhere?
> > 
> > tia 
> > 
> ASPseek can't delete part of a database.
> You can try to work around this limitation:
> 1. Put "Disallow urlmask" statements in "aspseek.conf"
> (regular expressions must be used here), for example:
> Disallow ^http://www\.amazon\.com/
> 2. Run "index -u urlmask1 -u urlmask2..." (SQL masks
> must be used here, for example:
> http://www.amazon.com/%)
> 
> However, we have never tried this before.
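
To make the difference between the two mask styles above concrete, here is a minimal Python sketch that builds both from one URL prefix. It is only an illustration: the helper names disallow_line and sql_mask are made up for this example and are not part of ASPseek, and the check at the end uses Python's re module, which is assumed to agree with ASPseek's regex handling only for a simple pattern like this one.

```python
import re

# Characters that have special meaning in a regular expression.
REGEX_META = set(".^$*+?()[]{}|\\")


def disallow_line(url_prefix):
    """Build a "Disallow" line (regex mask) for aspseek.conf.

    Hypothetical helper: anchors the pattern at the start of the URL
    and backslash-escapes regex metacharacters such as the dots.
    """
    escaped = "".join("\\" + ch if ch in REGEX_META else ch for ch in url_prefix)
    return "Disallow ^" + escaped


def sql_mask(url_prefix):
    """Build the SQL-style mask for "index -u".

    Hypothetical helper: '%' matches any suffix. Note that '_' is also
    a wildcard in SQL masks; it is left unescaped here.
    """
    return url_prefix + "%"


prefix = "http://www.amazon.com/"
line = disallow_line(prefix)
mask = sql_mask(prefix)

print(line)                        # line to put in aspseek.conf
print('index -u "' + mask + '"')   # command to run afterwards

# Sanity check with Python's regex engine: the pattern should match
# pages under the prefix and nothing on an unrelated host.
pattern = re.compile(line.split(" ", 1)[1])
print(bool(pattern.search("http://www.amazon.com/books/")))
print(bool(pattern.search("http://www.example.org/")))
```

The point of the sketch is only that the same prefix needs escaped dots and a leading ^ in the regex form, but a trailing % in the SQL form.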


[EMAIL PROTECTED] wrote:
> 
> > Try to disable your URL with an appropriate regex, then reindex it. It should help.
> 
> Do you have an example of the syntax for this?
> For example, say I want to delete http://www.somesite.com/
> 
> Also, is it possible to just delete the URL from the sites table in the
> MySQL database, or will that not work as expected?
> 
> Or will altering the entry for http://www.somesite.com/
> in the sites table to http://www.somesite.com.junk/ cause it to delete
> itself on the next index run?
> 
> Thank you
> 
> On Mon, 19 Mar 2001, Kir Kolyshkin wrote:
> 
> > Sorry, we have not implemented -C deletion with URL limits, so you would
> > indeed delete the whole database.
> >
> > Try to disable your URL with an appropriate regex, then reindex it. It should help.
> >
> > [EMAIL PROTECTED] wrote:
> > >
> > > What is the correct syntax to delete one site or a pattern of sites from the
> > > aspseek database?
> > >
> > > I am guessing it is:
> > > index -u "http://www.somesite.com/" -C
> > >
> > > However, I am nervous about trying this without asking first. I have
> > > successfully indexed 30,000 sites, and everything is very stable right now!
> > > The output of index says -C will clear the database, and I just want to clear
> > > one URL...
> > >
> > > Thanks!
> >
> > --   [EMAIL PROTECTED]      http://kir.sever.net      ICQ 7551596   --
> > Answers: $1, short: $5, correct: $25, dumb questions are still free.
> > Now listening to Morcheeba - Moog Island
> >

--   [EMAIL PROTECTED]      http://kir.sever.net      ICQ 7551596   --
Answers: $1, short: $5, correct: $25, dumb questions are still free.
Now listening to Morcheeba - Moog Island
