Author: Alexander Barkov
> I would like to crawl the whole HTML code for each URL.
Perhaps the cached copy is what you're looking for.
In 3.4.x, cached copies are stored in a separate table, "cachedcopy".
Cached copies are compressed by default, but compression can
be switched off.
> Is there any way to do this?
> I've tried this in the indexer.conf but it doesn't work:
> Section headhtml 25 2058 "<head([^>]*)>(*.)</head>" $2
> Section bodyhtml 26 2058 "<body([^>]*)>(*.)</body>" $2
> Section htmlcode 25 2058 "<html([^>]*)>(*.)</html>" $2
> Section body 1 2018 afterheaders html
> gets the body, but with all HTML tags stripped out :(
> Thank you for your help
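A likely reason the quoted Section lines don't work is that `(*.)` is not a valid regular-expression group: the `*` quantifier must follow the token it repeats, so the capture should read `(.*)`. The third quoted line also reuses section number 25, which is already taken by headhtml. A corrected sketch, keeping the quoted length values and assuming the same indexer.conf Section syntax (the section number 27 for htmlcode is a guess; any unused number should do):

```conf
# Capture head, body, and the full document with HTML tags preserved.
# "(.*)" replaces the invalid "(*.)" from the quoted attempt.
Section headhtml 25 2058 "<head([^>]*)>(.*)</head>" $2
Section bodyhtml 26 2058 "<body([^>]*)>(.*)</body>" $2
Section htmlcode 27 2058 "<html([^>]*)>(.*)</html>" $2
```

Note that depending on the regex library indexer was built with, `.` may not match newline characters, so patterns like these can still fail on multi-line documents; the cached copy mentioned above avoids that problem entirely.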
General mailing list