zregvart commented on pull request #724:
URL: https://github.com/apache/camel-website/pull/724#issuecomment-1049268827


   As a reminder, the deadline for migrating away from DocSearch is 15.3.2022, 
at which point the index will still be served but the DocSearch crawler will no 
longer update it.
   
   This is progressing nicely. I have figured out how to configure the crawler, 
which gives us much more flexibility than the DocSearch crawler we currently 
use. The configuration now in place at Algolia Crawler reflects that, e.g. 
there are separate record extraction configurations for different parts of the 
website. This will probably need more refinement, so feedback on the search 
performance is very welcome.
   
   When examining the search results, please make sure that you're still on 
the preview URL (https://pr-724--camel.netlify.app/), as it is easy to follow a 
link from a search result and land on the production website 
(https://camel.apache.org), where the old index is still used.
   
   There is one known issue, however: the canonical link for each page points to 
the latest released version, e.g. currently for the component reference this is 
3.15.0, and the crawler skips pages from any other version because their 
canonical link points elsewhere. 
There is a good explanation of this behavior in the [Antora 
documentation](https://docs.antora.org/antora/latest/playbook/site-url/#how-the-canonical-url-works).
 That means that only the latest version of the versioned documentation is 
indexed. This is not how the DocSearch crawler behaves; it seems to disregard 
the canonical link.
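   
   To see which pages are affected, something like the following could be used. 
It is only a minimal sketch of the rule as I understand it (a page is skipped 
when its canonical link names a different URL), not the crawler's actual 
logic. The helper names `canonical_of` and `points_elsewhere` and the example 
URLs are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attributes.get("href")


def canonical_of(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


def points_elsewhere(page_url: str, html: str) -> bool:
    """True when the page declares a canonical URL other than its own.

    Assumption: a canonical-respecting crawler would skip such a page.
    Trailing slashes are ignored in the comparison.
    """
    canonical = canonical_of(html)
    return canonical is not None and canonical.rstrip("/") != page_url.rstrip("/")


# Hypothetical URLs, used only to illustrate the check.
LATEST = "https://camel.apache.org/components/3.15.x/example.html"
OLDER = "https://camel.apache.org/components/3.14.x/example.html"
PAGE = f'<html><head><link rel="canonical" href="{LATEST}"></head></html>'

print(points_elsewhere(OLDER, PAGE))   # older version: canonical differs
print(points_elsewhere(LATEST, PAGE))  # latest version: canonical matches
```

   Running this over the pages of the preview site would list the pages that a 
canonical-respecting crawler leaves out of the index.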
   
   For that we have (I think) several options:
   - accept that only the latest version is indexed
   - remove canonical links
   - manually crawl and [upload 
data](https://www.algolia.com/doc/guides/sending-and-managing-data/send-and-update-your-data/)
 to Algolia
   - (not sure this is an option) set canonical links to actual page URLs
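   
   For the manual crawl option, the records we upload would need to carry the 
version so that pages from different versions stay distinct in the index. A 
rough sketch of building such a record follows. The field names other than 
`objectID`, the `build_record` helper, and the assumed 
`/<section>/<version>/<page>` path layout are my assumptions, and the actual 
upload step is left out.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def version_from_url(url: str) -> Optional[str]:
    """Extract the version segment from a versioned documentation URL.

    Assumes a /<section>/<version>/<page> layout; returns None for
    shorter paths, which are treated as unversioned.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[1] if len(parts) >= 3 else None


def build_record(url: str, title: str, content: str) -> dict:
    """Build one search record for a page.

    The page URL doubles as objectID, so each version of a page gets
    its own record instead of overwriting the latest one.
    """
    return {
        "objectID": url,
        "url": url,
        "version": version_from_url(url),
        "title": title,
        "content": content,
    }


# Hypothetical page, used only to show the record shape.
record = build_record(
    "https://camel.apache.org/components/3.14.x/example.html",
    "Example component",
    "Text extracted from the page body.",
)
print(json.dumps(record, indent=2))
```

   The resulting list of records could then be sent with whichever Algolia API 
client we settle on.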
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.