Author: Alexander Barkov
Email: b...@mnogosearch.org
Message:

> Actually, the way of using tag or categories is perfect but, i don't want
> to crawl again the whole site because i didn't write my tagging rule in
> the correct way the first time.

This task consists of two parts:

a. Update what you have in the tables "server" and "srvinfo".
   This is done automatically when you start crawling;
   "indexer -n0" will do this.

   Note, this is enough when you just need to rename some tag to a
   new value. But usually it is not enough, as you might want to
   redistribute documents between tags (i.e. split a single tag into
   multiple ones, join multiple tags into a single one, or do some
   more complex redistribution). In these cases part "b" is also needed.

b. Update the table "url" to refer to the table "server" properly.
   There is no special command for this. Normally, documents are
   updated properly only when they are crawled next time. But there is
   a trick: use the "skip" option temporarily, to avoid real downloading.

Suppose you want to split a section of your site into two subsections
and assign different tags to them. What you do is:

1. Change indexer.conf:

    # Remove the old command
    Tag doc
    Server http://host/doc/

    # And add two new commands instead
    Tag doca
    Server skip http://host/doc/a/
    Tag docb
    Server skip http://host/doc/b/

   Notice the "skip" option in the new commands.

2. Run "indexer -am -u 'http://host/doc/%'"

   It will sort of "crawl" all documents, but without real downloading.
   It will actually do nothing but execute a query like this for every
   document:

    UPDATE url SET status=200,next_index_time=1418965297,
    site_id=-1519382294,server_id=-1738492707 WHERE rec_id=259;

3. Make sure not to forget to remove the "skip" options from the new
   "Server" commands in indexer.conf.

4. Check that everything went well:

    SELECT server.tag,url.url FROM url,server
    WHERE url.server_id=server.rec_id;

Reply: <http://www.mnogosearch.org/board/message.php?id=21669>
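
For a larger site the check in step 4 returns one row per document, so a per-tag count may be easier to read. Below is a minimal sketch in Python, not part of mnoGoSearch itself. It runs against an in-memory SQLite database with a simplified, hypothetical schema that contains only the columns mentioned above (server.rec_id, server.tag, url.rec_id, url.url, url.server_id); the sample rows and the helper names tag_counts, orphaned_urls and demo_connection are made up for illustration. Against a real installation you would run the same two queries through your own database client. Whether a row in "url" can actually end up pointing at a missing "server" row in a real database is an assumption; the second query simply makes such a case visible if it occurs.

```python
import sqlite3


def tag_counts(conn):
    """Return {tag: number_of_documents} by joining url to server."""
    rows = conn.execute(
        "SELECT server.tag, COUNT(*) FROM url, server "
        "WHERE url.server_id = server.rec_id "
        "GROUP BY server.tag"
    ).fetchall()
    return dict(rows)


def orphaned_urls(conn):
    """Return URLs whose server_id matches no row in server."""
    rows = conn.execute(
        "SELECT url.url FROM url "
        "LEFT JOIN server ON url.server_id = server.rec_id "
        "WHERE server.rec_id IS NULL "
        "ORDER BY url.url"
    ).fetchall()
    return [row[0] for row in rows]


def demo_connection():
    """Build a tiny in-memory database (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE server (rec_id INTEGER PRIMARY KEY, tag TEXT);
        CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT,
                          server_id INTEGER);
        INSERT INTO server VALUES (1, 'doca'), (2, 'docb');
        INSERT INTO url VALUES
            (10, 'http://host/doc/a/1.html', 1),
            (11, 'http://host/doc/a/2.html', 1),
            (12, 'http://host/doc/b/1.html', 2),
            (13, 'http://host/doc/old.html', 99);
        """
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(tag_counts(conn))
    print(orphaned_urls(conn))
```

If the counts per tag match what you expect for /doc/a/ and /doc/b/ and the second list is empty, the re-tagging most likely went through for every document.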