Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:

The indexer sends an If-Modified-Since header when reindexing.
So if your site is properly configured, the server will not send the
whole document if it has not changed since the last indexing.
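
For illustration, here is a rough sketch of what such a conditional GET
looks like from the client side. It is not the indexer's actual code, just
a minimal Python example using the standard library; the function names
build_conditional_request and fetch_if_changed are made up for this sketch,
and it assumes the last indexing time is kept as a Unix timestamp.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_indexed_ts):
    """Build a GET request carrying If-Modified-Since (hypothetical helper)."""
    req = urllib.request.Request(url)
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    req.add_header("If-Modified-Since", formatdate(last_indexed_ts, usegmt=True))
    return req


def fetch_if_changed(url, last_indexed_ts):
    """Return the body if the document changed, or None on 304 Not Modified."""
    req = build_conditional_request(url, last_indexed_ts)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # urllib reports 304 as an HTTPError; the server sent no body.
        if err.code == 304:
            return None
        raise
```

A server that honours the header answers 304 with an empty body, so an
unchanged page costs only the headers of one request rather than a full
download.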

> I need to index a BIG site remotely and I don't want the indexer to retrieve all pages
> that are already indexed - can it check HEAD (timestamp/ETag) and, if it's the same,
> not GET the page?
> 
> ps. CVS seems broken..

It doesn't seem to be:

/usr/home/bar/mnogosearch > cvs update
cvs update: Updating .
cvs update: Updating create
cvs update: Updating create/ibase
cvs update: Updating create/msql
cvs update: Updating create/mssql
cvs update: Updating create/mysql
cvs update: Updating create/oracle
cvs update: Updating create/pgsql
cvs update: Updating create/sapdb
cvs update: Updating create/solid
cvs update: Updating create/stopwords
cvs update: Updating create/sybase
cvs update: Updating create/virtuoso
cvs update: Updating doc
cvs update: Updating doc/samples
cvs update: Updating etc
cvs update: Updating include
cvs update: Updating misc
cvs update: Updating src
/usr/home/bar/mnogosearch > 


> pps. hint about internal DB: use BerkeleyDB (www.sleepycat.com) (default on most 
>unixes)
> 

Oh, yes. Thanks for the suggestion. Ramil is already testing BDB
to see whether it can be used as a storage backend. The results are
pretty good. It seems msearch with BDB will be very fast.


Reply: <http://search.mnogo.ru/board/message.php?id=1974>
