Author: rafikCyc
Email:
Message:
I added these Disallow lines so that both applications crawl the same
number of URLs (approximately 140):
Disallow */basket-villeurbanne/author/*
Disallow *?p=*
Disallow */feed
I did this because it seems that Mnogosearch handles robots.txt, but
not the meta robots noindex,follow tag.
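
For reference, a robots.txt file normally puts a colon after each field
name and groups the rules under a User-agent line. Below is a minimal
sketch of how the three rules above might be written out. The
"User-agent: *" line and the print_robots helper are assumptions for
illustration, not taken from the original configuration. Wildcard
matching with "*" was not part of the original robots.txt convention, so
check whether a given crawler honors it.

```shell
#!/bin/sh
# Print a robots.txt containing the three rules quoted in this message.
# Assumption: the rules apply to every crawler ("User-agent: *");
# the message does not say which user agent they were written for.
print_robots() {
    cat <<'EOF'
User-agent: *
Disallow: */basket-villeurbanne/author/*
Disallow: *?p=*
Disallow: */feed
EOF
}

# Write the output to the site root, e.g.: print_robots > robots.txt
print_robots
```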
Here are the results:
-----
indexer -C;
indexer;
[18898]{01} Done (53 seconds, 168 documents, 3503752 bytes, 64.56
Kbytes/sec.)
------
indexer -C;
indexer -N5;
[19261]{02} Done (15 seconds, 46 documents, 982938 bytes, 63.99
Kbytes/sec.)
[19261]{03} Done (15 seconds, 48 documents, 930200 bytes, 60.56
Kbytes/sec.)
[19261]{01} Done (5 seconds, 14 documents, 323667 bytes, 63.22
Kbytes/sec.)
[19261]{05} Done (15 seconds, 46 documents, 974427 bytes, 63.44
Kbytes/sec.)
[19261]{04} Done (5 seconds, 14 documents, 292520 bytes, 57.13
Kbytes/sec.)
[19261]{--} Done (26 seconds, 168 documents, 3503752 bytes, 131.60
Kbytes/sec.)
------
indexer -C;
indexer -N50;
[20289]{11} Done (11 seconds, 28 documents, 585571 bytes, 51.99
Kbytes/sec.)
[20289]{28} Done (11 seconds, 29 documents, 705247 bytes, 62.61
Kbytes/sec.)
[20289]{16} Done (11 seconds, 30 documents, 635782 bytes, 56.44
Kbytes/sec.)
[20289]{30} Done (11 seconds, 30 documents, 635178 bytes, 56.39
Kbytes/sec.)
[20289]{--} Done (21 seconds, 168 documents, 3504392 bytes, 162.96
Kbytes/sec.)
------
mysql -uroot -p -N --database=db_test_mnogo --execute="SELECT url
FROM url" > ~/ALL.txt;
(cat ~/ALL.txt | parallel -j8 --gnu "wget {}");
real 0m10.638s
user 0m1.256s
sys 0m1.519s
---
Screaming Frog : 12s
This just confirms that Mnogosearch is relatively slower than
Screaming Frog. Even compared to a parallel wget run from bash,
Mnogosearch is slower.
It gets a little better with indexer -N50, though.
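
The throughput figures in the indexer output can be cross-checked from
the logged byte counts and elapsed times. Below is a minimal sketch,
assuming one Kbyte is 1024 bytes (this assumption reproduces the logged
totals). The kbps helper name is hypothetical. Only elapsed times are
given for wget and Screaming Frog, so no byte rate is computed for them.

```shell
#!/bin/sh
# kbps BYTES SECONDS: print throughput in Kbytes/sec.
# Assumption: 1 Kbyte = 1024 bytes.
kbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2f\n", b / s / 1024 }'
}

kbps 3503752 53   # indexer (single thread), logged as 64.56
kbps 3503752 26   # indexer -N5 total,       logged as 131.60
kbps 3504392 21   # indexer -N50 total,      logged as 162.96
```

By the same totals, going from one thread to 50 cuts the elapsed time
from 53 to 21 seconds for the same 168 documents.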
Reply: <http://www.mnogosearch.org/board/message.php?id=21767>
_______________________________________________
General mailing list
[email protected]
http://lists.mnogosearch.org/listinfo/general