Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
The indexer sends an If-Modified-Since header when reindexing.
So if your site is properly configured, the server will not resend
a whole document that has not changed since the last indexing.
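
To illustrate the mechanism, here is a minimal Python sketch of a conditional GET. It is not the indexer's actual code; the function names and the `last_indexed_epoch` parameter are hypothetical and chosen only for this example. It assumes the server honours If-Modified-Since and answers 304 Not Modified for unchanged documents.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_indexed_epoch):
    """Build a GET request that carries If-Modified-Since (hypothetical helper)."""
    req = urllib.request.Request(url)
    # HTTP dates are always expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    req.add_header("If-Modified-Since", formatdate(last_indexed_epoch, usegmt=True))
    return req


def fetch_if_changed(url, last_indexed_epoch):
    """Return the body if the document changed, or None on 304 Not Modified."""
    req = build_conditional_request(url, last_indexed_epoch)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # urllib reports 304 as an HTTPError; it means "unchanged, no body sent".
        if err.code == 304:
            return None
        raise
```

With a server that supports this, an unchanged page costs only a short header exchange, which is what you want when reindexing a big site remotely.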
> I need to index BIG site remotely and I don't want indexer retrieve all pages that
>already indexed - can it check HEAD (timestamp/ETag) and if its same don't GET the
>page?
>
> ps. CVS seems broken..
It doesn't seem to be:
/usr/home/bar/mnogosearch/ etc > cvs update
cvs update: Updating .
cvs update: Updating create
cvs update: Updating create/ibase
cvs update: Updating create/msql
cvs update: Updating create/mssql
cvs update: Updating create/mysql
cvs update: Updating create/oracle
cvs update: Updating create/pgsql
cvs update: Updating create/sapdb
cvs update: Updating create/solid
cvs update: Updating create/stopwords
cvs update: Updating create/sybase
cvs update: Updating create/virtuoso
cvs update: Updating doc
cvs update: Updating doc/samples
cvs update: Updating etc
cvs update: Updating include
cvs update: Updating misc
cvs update: Updating src
/usr/home/bar/mnogosearch >
> pps. hint about internal DB: use BerkeleyDB (www.sleepycat.com) (default on most
>unixes)
>
Oh, yes. Thanks for the suggestion. Ramil is already testing BDB
to see whether it can be used as storage. The results are pretty good.
It seems msearch with BDB will be very fast.
Reply: <http://search.mnogo.ru/board/message.php?id=1974>