Author: Alexander Barkov

> Hello,
> I couldn't find any information on this subject.
> As people start using HTTPS, I run into more and more problems when crawling
> links that don't specify a protocol.
> Let's take this example of a link from :
> <a href="//">text</a>
> Will be seen as :
> And of course will cause a 404 error.
> Any idea how to get the right links?
> Thanks.

The crawler stores full URLs, protocol included, in the database.
But you can strip the protocol at search time
using the search template language.
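
To illustrate the idea outside the template language, here is a minimal
Python sketch (not mnoGoSearch template syntax). It shows how a
protocol-relative link is normally resolved against the page it appears on,
and how a stored full URL can be turned back into a protocol-relative one.
The helper names and the example.com URLs are assumptions made up for this
illustration.

```python
import re
from urllib.parse import urljoin

# Matches a leading "http:" or "https:" only when "//" follows it.
_SCHEME_RE = re.compile(r"^https?:(?=//)", re.IGNORECASE)


def resolve_link(base_url: str, href: str) -> str:
    """Resolve href against the page URL, as a crawler would.

    A protocol-relative href ("//host/path") inherits the scheme
    of the page it was found on.
    """
    return urljoin(base_url, href)


def strip_scheme(url: str) -> str:
    """Hypothetical helper: drop the scheme from a stored full URL,
    leaving a protocol-relative URL for display."""
    return _SCHEME_RE.sub("", url, count=1)


if __name__ == "__main__":
    page = "https://example.com/index.html"   # made-up page URL
    href = "//cdn.example.com/style.css"      # protocol-relative link
    full = resolve_link(page, href)
    print(full)
    print(strip_scheme(full))
```

The same substitution, removing a leading "http:" or "https:", is what the
template-side regex would need to do.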

In 3.4.x use regex_substr:

In 3.3.x use the EREG template operator:
