Author: Alexander Barkov
Email: b...@mnogosearch.org
Message:

Hello,

> Hello,
>
> I would like to crawl the whole html code for each url.

Perhaps a cached copy is what you're looking for. In 3.4.x, cached copies are stored in a separate table, "cachedcopy". Cached copies are compressed by default, but compression can be switched off:
http://www.mnogosearch.org/doc34/msearch-cmdref-cachedcopyencoding.html

> Is there any way to do this?
>
> I've tried this in the indexer.conf but it doesn't work:
>
> Section headhtml 25 2058 "<head([^>]*)>(*.)</head>" $2
> Section bodyhtml 26 2058 "<body([^>]*)>(*.)</body>" $2
> Section htmlcode 25 2058 "<html([^>]*)>(*.)</html>" $2
>
> Section body 1 2018 afterheaders html
> gets the body but with all html tags stripped out :(
>
> Thank you for your help

Reply: <http://www.mnogosearch.org/board/message.php?id=21773>

_______________________________________________
General mailing list
General@mnogosearch.org
http://lists.mnogosearch.org/listinfo/general
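
A side note on the quoted Section patterns: the group is written as "(*.)", which in most regex dialects is a quantifier with nothing to repeat, where "(.*)" was probably intended. The sketch below only illustrates this with Python's `re` module. The helper names `compiles` and `extract` are hypothetical, and whether mnogosearch's own matcher treats these patterns the same way is an assumption, not something confirmed in this thread.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def extract(html: str, tag: str):
    """Return the raw markup between <tag ...> and </tag>, or None.

    DOTALL lets '.' match newlines, so multi-line content is captured.
    """
    pattern = rf"<{tag}([^>]*)>(.*)</{tag}>"
    match = re.search(pattern, html, re.DOTALL | re.IGNORECASE)
    return match.group(2) if match else None


if __name__ == "__main__":
    # The pattern as quoted in the message versus the likely intent.
    print(compiles(r"<head([^>]*)>(*.)</head>"))
    print(compiles(r"<head([^>]*)>(.*)</head>"))

    page = "<html><head lang='en'>\n<title>T</title>\n</head><body>x</body></html>"
    print(repr(extract(page, "head")))
```

Even with a corrected pattern, this is only a check of the expression itself; for keeping the full HTML of each document, the cached copy mentioned in the reply remains the suggested route.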