Author: duncan
Email: [EMAIL PROTECTED]
Message:
Hi, I searched the list, so forgive me if this has already been covered.

I am struggling to get the results I want.

I want to spider through X sites and grab _only_ .tgz files. I want every
search result to point at a .tgz file. I seem to be close, but each result
only gives the "Index of /some/path/to/tgz/" directory listing as the link.

My search will run across FTP and HTTP servers, and I have tried varying
combinations of the following rules:


#Allow Match .tgz
 
 
 
# Exclude Apache directory list in different sort order using "string" match:
Disallow *D=A *D=D *M=A *M=D *N=A *N=D *S=A *S=D
 
# More complicated case. RAR .r00-.r99, ARJ a00-a99 files
# and unix shared libraries. We use "Regex" match type here:
Disallow Regex \.r[0-9][0-9]$ \.a[0-9][0-9]$ \.so\.[0-9]$
 
CheckOnly *.tgz
#CheckOnly [^/]$
 
#HrefOnly Match NoCase \/$|\.html$|\.shtml$|\.phtml$|\.php$|\.txt$|\.htm$|\.tgz$
HrefOnly Match NoCase \/$|\.html$|\.shtml$|\.phtml$|\.php$|\.txt$|\.htm$
 
Allow Match .tgz /*
#Disallow *
 

 
UrlWeight 30
UrlFileWeight 30
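
A quick way to sanity-check the patterns above outside the indexer is a few lines of Python. This is only a rough sketch: it uses Python's re and fnmatch modules, not mnoGoSearch's own matcher, so the behavior may differ in detail. The helper names and sample URLs are made up for illustration.

```python
import re
from fnmatch import fnmatchcase

# Patterns copied from the "Disallow Regex" line in the rules above.
DISALLOW_REGEXES = [r"\.r[0-9][0-9]$", r"\.a[0-9][0-9]$", r"\.so\.[0-9]$"]

# Wildcard copied from the "CheckOnly *.tgz" line.
TGZ_WILDCARD = "*.tgz"


def is_disallowed(url: str) -> bool:
    """Return True if any of the Disallow regexes matches the URL."""
    return any(re.search(pattern, url) for pattern in DISALLOW_REGEXES)


def is_tgz(url: str) -> bool:
    """Return True if the URL matches the *.tgz wildcard (case-sensitive)."""
    return fnmatchcase(url, TGZ_WILDCARD)


# Hypothetical sample URLs, not taken from any real server.
SAMPLE_URLS = [
    "ftp://example.com/pub/archive.r07",
    "http://example.com/lib/libfoo.so.1",
    "http://example.com/pub/pkg-1.0.tgz",
    "http://example.com/pub/",
]

for url in SAMPLE_URLS:
    print(url, "disallowed" if is_disallowed(url) else "ok",
          "tgz" if is_tgz(url) else "not-tgz")
```

This only shows which URLs each pattern would catch on its own. It says nothing about the order in which the indexer applies the rules.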


TIA

duncan

Reply: <http://search.mnogo.ru/board/message.php?id=1950>

