[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874944#comment-16874944 ]

Alexandre Rafalovitch commented on SOLR-13571:

We could definitely do a sitemap. But we could also update the redirect list and see whether that makes much of a difference. I had a quick look in the infra repo and there seem to be two files: solr_id_to_new.map.txt and solr_name_to_new.map.txt. These seem to correspond to the ones we generated in SOLR-10595. So perhaps we just need to review those files for target file name changes (they may be 99% the same) and ask Infra to refresh the files with the new URL base of 8.1.

Also, it would be nice if we could get access to the Google Webmaster tools. That can be done by publishing a file to the server; can we do that outside of a full publication process?

Finally, if we republish 6.6 with an additional canonical header pointing to latest (or 8.1 or whatever), this may also refocus the search ranking. The work for that would probably be identical to that required to redo the maps.

> Make recent RefGuide rank well in Google
>
> Key: SOLR-13571
> URL: https://issues.apache.org/jira/browse/SOLR-13571
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: documentation
> Reporter: Jan Høydahl
> Priority: Major
>
> Spinoff from SOLR-13548
> The old Confluence ref-guide has a lot of pages pointing to it, and all of
> that link karma is delegated to the {{/solr/guide/6_6/}} html ref guide,
> making it often rank top. However we'd want newer content to rank high. See
> these comments for some first ideas.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
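The canonical-header idea in the comment above could look roughly like the sketch below. It is only an illustration under assumptions: it takes "canonical header" to mean a `<link rel="canonical">` element in the page head, it assumes a page keeps the same file name in the target version, and the function name `add_canonical` is made up for this example. The base URL is the one used elsewhere in this thread.

```python
import re

# Base URL of the published guides, as quoted in this thread.
GUIDE_BASE = "https://lucene.apache.org/solr/guide"


def add_canonical(html, page, target_version="8_1"):
    """Return html with a canonical link to the same page in target_version.

    Hypothetical helper: assumes the page name is unchanged between versions.
    """
    if 'rel="canonical"' in html:
        return html  # page already declares a canonical URL, leave it alone
    link = f'<link rel="canonical" href="{GUIDE_BASE}/{target_version}/{page}">'
    # Insert just before the first closing </head>; pages without one are untouched.
    return re.sub(
        r"</head>",
        lambda m: link + "\n" + m.group(0),
        html,
        count=1,
        flags=re.IGNORECASE,
    )
```

Pages that were renamed or removed in the newer guide would need the same review as the redirect maps, which matches the remark that the work is probably identical.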
[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874861#comment-16874861 ]

Jan Høydahl commented on SOLR-13571:

[~sokolov] a sitemap.xml is also an interesting idea, and perhaps the easiest to try first, i.e. publish a sitemap that gives tons of weight to the 8_1 guide and decreasing weight the older the guide gets. Or only mention the newest? If it plays out well then that's all we need.
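The weighted sitemap suggested in the comment above might be generated along these lines. This is a sketch, not an agreed design: the function name `build_sitemap`, the 0.2 step per older version and the 0.1 floor are arbitrary choices for illustration. The `priority` element is part of the sitemap protocol, but it is only a hint and search engines may ignore it.

```python
from xml.etree import ElementTree as ET

GUIDE_BASE = "https://lucene.apache.org/solr/guide"  # base URL from this thread
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages_by_version):
    """Build sitemap XML from a dict of version -> list of page names.

    The dict must be ordered newest version first. The newest version gets
    priority 1.0 and each older one 0.2 less, down to a floor of 0.1
    (illustrative numbers only).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for age, (version, pages) in enumerate(pages_by_version.items()):
        priority = max(0.1, 1.0 - 0.2 * age)
        for page in pages:
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = f"{GUIDE_BASE}/{version}/{page}"
            ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")
```

Passing only the newest version in the dict gives the "only mention the newest" variant.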
[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874548#comment-16874548 ]

Jan Høydahl commented on SOLR-13571:

{quote}I'm not totally against the idea of having a "latest", but I don't quite get why it can't be a redirect?
{quote}

Today the "latest" redirect hack is not a landing page of its own, and it uses a 302 redirect, which I believe will not pass on page rank to the target. Take [https://lucene.apache.org/solr/guide/about-this-guide.html], which now redirects to [https://lucene.apache.org/solr/guide/8_1/about-this-guide.html]. The user will now start sharing the 8_1 link, and in a few years we will have the same issue again, with the 8_1 guide holding a lot of credit. Since the URL in the browser changes, the version-less URL is hard to bookmark and copy, so it won't get much use in the wild.

If, on the other hand, we had a [https://lucene.apache.org/solr/guide/latest/about-this-guide.html] landing page, we could move the cwiki 301 redirect ([https://cwiki.apache.org/confluence/display/solr/About+This+Guide]) to the new stable location. I'm not sure, though, whether Google has already moved all the rank points to the 6_6 HTML URL, or whether moving the redirects again will suddenly make the /latest/ URLs rank high. If the 6_6 guide still has all the points we could of course redirect all 6_6 links to "latest" as well, but then the 6_6 guide would be unreachable :). To fix that we could re-release the 6_6 guide under e.g. 6_6_0.

The extra effort if we choose such a model is
* Copy the generated guide twice to the release repo, to two different locations
* Make sure page renames are handled, e.g. as I proposed above: track when a page that existed before no longer exists in the to-be-published guide, and then add a redirect for it to the latest version that had that page, or add a dummy page with a link on it. This would be scripted as part of the release process - make a tool comparing the page tree between two versions.
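The page-tree comparison tool mentioned in the comment above could start from something like the following sketch. It assumes both guide versions are available as local directories of generated `.html` files; the function names and the `(source, target)` output shape are hypothetical, and turning the pairs into actual redirect rules is left open because the thread does not settle on a format.

```python
from pathlib import Path


def html_pages(root):
    """Return the set of .html paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*.html")}


def missing_page_redirects(old_root, new_root, old_version, base="/solr/guide"):
    """List (source, target) pairs for pages that exist in the old guide only.

    Each pair sends the "latest" URL of a vanished page to the last
    version that still had it (old_version).
    """
    vanished = sorted(html_pages(old_root) - html_pages(new_root))
    return [
        (f"{base}/latest/{page}", f"{base}/{old_version}/{page}")
        for page in vanished
    ]
```

A rename shows up here as one vanished page plus one new page, so deciding whether to redirect to the renamed page instead would still need a human look or a rename list.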
[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874290#comment-16874290 ]

Mike Sokolov commented on SOLR-13571:

Have we ever tried publishing a site map? Google used to have a feature that would read an XML file that described all the pages on the site as a hint to its crawler. Also I wonder if we have ever checked out Google webmaster tools for the documentation site(s).
[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874253#comment-16874253 ]

Alexandre Rafalovitch commented on SOLR-13571:

I guess one place to start thinking this through is how important it is that users find the reference manual. As a reference point, Stack Overflow (and the rest of the network) focuses more on being discovered by Google than on its internal search engine. Obviously, they have to, as that's where the money and attention are. But it is still an interesting explicit goalpost.

For us, if users cannot find a relevant reference guide page quickly, they may
* think a particular feature does not exist
* join and ask on the User Mailing list
* discover the reference guide in general and browse through it
* discover the reference guide and use our - still limited - internal search

None of the options above seems optimal compared to leveraging the public search engine. But then we have to worry about SEO. Clearly, the current SEO works well enough to get us to the 6.6 version of the guide and - very importantly - to a somewhat relevant page. Switching that to be a single target page would be easier for us, but may cost a lot of SEO. And, frankly, I am not at all sure that our guide is SEO-friendly enough on its own.

I just did a search for MappingCharFilterFactory (as an example) and the 6.6 RefGuide is at the top, followed by (old) Javadoc, (old) Wiki, two source-code class links and then random websites and blogs. The latest version's link just does not seem to appear in the first couple of pages (though a 7.x clone of the RefGuide on some Chinese community site does). I suspect that Google is detecting the multiple guide versions as duplicate content and therefore displays only one version, and the 6.6 version has more weight due to the redirects. But if we remove/collapse that link, I am not sure the correct/latest version of the manual will be picked up. This feels risky to me.

I don't know what the optimal solution is, given the limited resources available for this part of the project. I am just really worried that lost Google ranking is hard to get back. Perhaps, as a minimum step, we could just refresh the URL map periodically to use whatever the latest version is.
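The periodic refresh of the URL map suggested above could be scripted roughly as follows. The line format is an assumption: the sketch treats each map line as a key followed by a target path containing `/solr/guide/X_Y/`, which has not been checked against the real map files. The function name `retarget_map` and the `known_pages` parameter are hypothetical.

```python
import re

# Matches the versioned part of a guide path, e.g. /solr/guide/6_6/
VERSION_SEGMENT = re.compile(r"/solr/guide/\d+_\d+/")


def retarget_map(lines, new_version, known_pages):
    """Point map targets at new_version where the page still exists.

    lines:       map lines, ASSUMED to look like "<key> /solr/guide/6_6/<page>"
    known_pages: set of page names present in the new guide
    Returns (updated_lines, needs_review); lines whose page is not in
    known_pages are left unchanged and reported for manual review.
    """
    updated, needs_review = [], []
    for line in lines:
        match = VERSION_SEGMENT.search(line)
        if match is None:
            updated.append(line)  # comments, blanks, unrelated targets
            continue
        page = line[match.end():].strip().split("#")[0]
        if page not in known_pages:
            needs_review.append(line)
            updated.append(line)
            continue
        updated.append(
            line[:match.start()] + f"/solr/guide/{new_version}/" + line[match.end():]
        )
    return updated, needs_review
```

The `needs_review` list is where renamed or removed target pages would surface, which is the part that cannot be refreshed blindly.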
[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874198#comment-16874198 ]

Cassandra Targett commented on SOLR-13571:

I'm not totally against the idea of having a "latest", but I don't quite get why it can't be a redirect? My gut reaction is that it further complicates the release process, and since I'm pretty much the only one who ever does it (with one recent exception), I'd like to be very sure that additional steps are necessary. I'd be more likely to get on board if you were able to spell out the specific changes to the release process that this would cause.

Maybe it would be simpler to ask Infra to just change that big list of redirects to go to one single page that says "You have a link to the old version of the Ref Guide, here's where the latest versions are." Or just have it go to https://lucene.apache.org/solr/guide/. I mean, it's the internet - stuff moves and life pretty much goes on.

Related to that idea, we need to institute a proper 404 page and redirect rule for it.

There are also a large number of duplicated files in each release - CSS, fonts, images. I have been recently thinking I'd like to restructure everything so we stop uploading things that are very unlikely to change from release-to-release, but that is way beyond the scope here, and I don't have any concrete ideas there yet.

I think it's worth asking if the value we'd get here is worth the effort of more steps to the process and more duplication of content. It's been 3 years since we moved. I agree that having the 6.6 Guide rank highest is not good. But perhaps we can fix that in a simpler way?
[jira] [Commented] (SOLR-13571) Make recent RefGuide rank well in Google
[ https://issues.apache.org/jira/browse/SOLR-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870915#comment-16870915 ]

Jan Høydahl commented on SOLR-13571:

Ideally I'd like us to have a real copy of the RefGuide at [https://lucene.apache.org/solr/guide/latest/] instead of today's redirect, which rewrites the URL to the latest version if you omit the X_Y part. It would still be possible to permalink to a specific version of the guide. But if e.g. [https://lucene.apache.org/solr/guide/latest/solrcloud.html] contained the 8_1 guide right now, and once 8_2 is released we published it to both the "8_2" subfolder and the "latest" subfolder, then the rank authority of that "latest" URL would remain, and hopefully grow strong over time? It would also make it way easier for people to link to the latest version if they do not care about the version.

We'd obviously need to sort out how to handle URL renames and deletions. Part of the release process could perhaps be to generate a list of all pages in the existing "latest" and in the new guide to be released, and for every page that existed in X_Y but not in the newest X_Z, we'd add a redirect rule to X_Y for that specific page, to make sure we don't break too many links on the "latest" guide.

[~arafalov], [~ctargett]
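The "publish to both the versioned subfolder and latest" step proposed in the comment above could be sketched like this. It assumes the release area is a plain local directory tree, which may not match how the release repo is actually managed, and the function name `publish` is made up for illustration.

```python
import shutil
from pathlib import Path


def publish(build_dir, release_root, version):
    """Copy a generated guide into both <release_root>/<version> and
    <release_root>/latest, replacing whatever was there before.

    Hypothetical layout: release_root is a local directory of guide versions.
    """
    release_root = Path(release_root)
    targets = [release_root / version, release_root / "latest"]
    for target in targets:
        if target.exists():
            # Replace rather than merge, so pages dropped from the new
            # guide do not linger under "latest".
            shutil.rmtree(target)
        shutil.copytree(build_dir, target)
    return targets
```

Because "latest" is replaced wholesale, any page that no longer exists would need a redirect rule of the kind described in the comment, or links to it under "latest" would break.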