[jira] [Comment Edited] (FLINK-9541) Add robots.txt and sitemap.xml to Flink website

2018-11-19 Thread Ken Krugler (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689936#comment-16689936
 ] 

Ken Krugler edited comment on FLINK-9541 at 11/19/18 11:38 PM:
---

I'd asked on bui...@apache.org 
about setting this up, but didn't hear back. Turns out Gavin McDonald had 
[responded|http://mail-archives.apache.org/mod_mbox/www-builds/201806.mbox/%3C21B85DEA-438A-42F0-8FAE-F25820F396A9%4016degrees.com.au%3E]...
{quote}Ok Ken and anyone else interested. I have updated the robots.txt [1] 
file to point to a sitemap-index.xml [2] file. So, all you now need to do is 
ensure you have a flink.xml.gz sitemap in ci.apache.org/projects/flink 
and create a PR against our sitemap-index.xml file, and done, hopefully.
{quote}
I can create the sitemap file and build the pull request, but it would be good 
to get some input on what to put in the sitemap. For example, as a first cut it 
would be easiest to just have 
[https://ci.apache.org/projects/flink/flink-docs-stable/] as the only docs, as 
(I assume) that's what we'd want most people to find if they were doing a 
search without a version number in the query, yes? Maybe [~fhueske] can weigh 
in here.
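
As a rough illustration of that first cut, here is a minimal Python sketch that builds a sitemap holding only the stable docs URL and gzips it. The helper names (build_sitemap, gzip_sitemap) are made up for this example, and it assumes the standard sitemaps.org 0.9 schema; it is not the script that would necessarily be used.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a <urlset> sitemap document as UTF-8 bytes (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Each page gets one <url><loc>...</loc></url> entry.
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def gzip_sitemap(urls):
    """Return the gzipped sitemap bytes, i.e. the content of flink.xml.gz."""
    return gzip.compress(build_sitemap(urls))
```

Calling gzip_sitemap(["https://ci.apache.org/projects/flink/flink-docs-stable/"]) and writing the result to flink.xml.gz would give the single-entry sitemap described above; more URLs can be added to the list once we agree on what belongs there.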


was (Author: kkrugler):
I'd asked on bui...@apache.org 
about setting this up, but didn't hear back. Turns out Gavin McDonald had 
responded..[.|http://mail-archives.apache.org/mod_mbox/www-builds/201806.mbox/%3C21B85DEA-438A-42F0-8FAE-F25820F396A9%4016degrees.com.au%3E]
{quote}Ok Ken and anyone else interested. I have updated the robots.txt [1] 
file to point to a sitemap-index.xml [2] file. So, all you now need to do is 
ensure you have a flink.xml.gz sitemap in ci.apache.org/projects/flink 
and create a PR against our sitemap-index.xml file, and done, hopefully.
{quote}
I can create the sitemap file and build the pull request, but it would be good 
to get some input on what to put in the sitemap. For example, as a first cut it 
would be easiest to just have 
[https://ci.apache.org/projects/flink/flink-docs-stable/] as the only docs, as 
(I assume) that's what we'd want most people to find if they were doing a 
search without a version number in the query, yes? Maybe [~fhueske] can weigh 
in here.

> Add robots.txt and sitemap.xml to Flink website
> ---
>
> Key: FLINK-9541
> URL: https://issues.apache.org/jira/browse/FLINK-9541
> Project: Flink
> Issue Type: Improvement
> Components: Project Website
> Reporter: Fabian Hueske
> Priority: Major
>
> From the [dev mailing 
> list|https://lists.apache.org/thread.html/71ce1bfbed1cf5f0069b27a46df1cd4dccbe8abefa75ac85601b088b@%3Cdev.flink.apache.org%3E]:
> {quote}
> It would help to add a sitemap (and the robots.txt required to reference it) 
> for flink.apache.org and ci.apache.org (for /projects/flink)
> You can see what Tomcat did along these lines - 
> http://tomcat.apache.org/robots.txt references 
> http://tomcat.apache.org/sitemap.xml, which is a sitemap index file pointing 
> to http://tomcat.apache.org/sitemap-main.xml
> By doing this, you can emphasize more recent versions of docs. There are 
> other benefits, but reducing poor Google search results (to me) is the 
> biggest win.
> E.g. https://www.google.com/search?q=flink+reducingstate (search on flink 
> reducing state) shows the 1.3 Javadocs (hit #1), master (1.6-SNAPSHOT) 
> Javadocs (hit #2), and then many pages of other results.
> Whereas the Javadocs for 1.5 and 1.4 are nowhere to be found.
> {quote}
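
The chain described in the quote (robots.txt references a sitemap index, which in turn points to one or more sitemaps) could be sketched in Python roughly as follows. The function names are hypothetical, the Tomcat URLs are taken from the quote above, and the sitemaps.org 0.9 schema is assumed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document listing the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        # Each referenced sitemap gets one <sitemap><loc>...</loc></sitemap> entry.
        loc = ET.SubElement(ET.SubElement(index, "sitemap"), "loc")
        loc.text = url
    return ET.tostring(index, encoding="unicode")


def robots_sitemap_line(index_url):
    """Return the robots.txt directive that advertises the sitemap index."""
    return "Sitemap: " + index_url


# Mirrors the Tomcat layout mentioned in the quote.
index_xml = build_sitemap_index(["http://tomcat.apache.org/sitemap-main.xml"])
robots_line = robots_sitemap_line("http://tomcat.apache.org/sitemap.xml")
```

Here robots_line is the single line that would go into robots.txt, and index_xml is what would be served at the URL it names.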



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-9541) Add robots.txt and sitemap.xml to Flink website

2018-11-19 Thread Ken Krugler (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689936#comment-16689936
 ] 

Ken Krugler edited comment on FLINK-9541 at 11/19/18 11:37 PM:
---

I'd asked on bui...@apache.org 
about setting this up, but didn't hear back. Turns out Gavin McDonald had 
responded..[.|http://mail-archives.apache.org/mod_mbox/www-builds/201806.mbox/%3C21B85DEA-438A-42F0-8FAE-F25820F396A9%4016degrees.com.au%3E]
{quote}Ok Ken and anyone else interested. I have updated the robots.txt [1] 
file to point to a sitemap-index.xml [2] file. So, all you now need to do is 
ensure you have a flink.xml.gz sitemap in ci.apache.org/projects/flink 
and create a PR against our sitemap-index.xml file, and done, hopefully.
{quote}
I can create the sitemap file and build the pull request, but it would be good 
to get some input on what to put in the sitemap. For example, as a first cut it 
would be easiest to just have 
[https://ci.apache.org/projects/flink/flink-docs-stable/] as the only docs, as 
(I assume) that's what we'd want most people to find if they were doing a 
search without a version number in the query, yes? Maybe [~fhueske] can weigh 
in here.


was (Author: kkrugler):
I'd asked on bui...@apache.org 
about setting this up, but didn't hear back. Turns out Gavin McDonald had 
responded...
{quote}Ok Ken and anyone else interested. I have updated the robots.txt [1] 
file to point to a sitemap-index.xml [2] file. So, all you now need to do is 
ensure you have a flink.xml.gz sitemap in ci.apache.org/projects/flink 
 and create a PR against our 
sitemap-index.xml file, and done, hopefully.
{quote}
I can create the sitemap file and build the pull request, but it would be good 
to get some input on what to put in the sitemap. For example, as a first cut it 
would be easiest to just have 
[https://ci.apache.org/projects/flink/flink-docs-stable/] as the only docs, as 
(I assume) that's what we'd want most people to find if they were doing a 
search without a version number in the query, yes? Maybe [~fhueske] can weigh 
in here.



[jira] [Comment Edited] (FLINK-9541) Add robots.txt and sitemap.xml to Flink website

2018-11-19 Thread Chesnay Schepler (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691517#comment-16691517
 ] 

Chesnay Schepler edited comment on FLINK-9541 at 11/19/18 10:17 AM:


Would this still have the desired effect of hiding 1.3 docs without 
regenerating them specifically?

We already added canonical link elements to point to the latest version; sounds 
like we're duplicating efforts here. They just don't apply to 1.3 and earlier, 
since the build environment is currently broken.
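
For reference, the canonical link approach can be sketched in Python as below. This is only an illustration: the directory pattern (flink-docs-release-X.Y or flink-docs-master), the choice of flink-docs-stable as the target, and the function names are assumptions made for this example, not a description of how the docs build actually generates the element.

```python
import re
from html import escape

# Assumed layout: versioned docs live under /flink-docs-release-X.Y/ or
# /flink-docs-master/, and the preferred copy lives under /flink-docs-stable/.
VERSIONED_DIR = re.compile(r"/flink-docs-(?:release-\d+\.\d+|master)/")


def canonical_url(page_url):
    """Map a versioned docs URL to the same page under flink-docs-stable."""
    return VERSIONED_DIR.sub("/flink-docs-stable/", page_url, count=1)


def canonical_link_tag(page_url):
    """Return the <link rel="canonical"> element for a docs page."""
    href = escape(canonical_url(page_url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)
```

A page that already lives under flink-docs-stable is returned unchanged, so the same function could be applied to every page at build time. Pages that are never regenerated would not pick up the element, which is the limitation noted above.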


was (Author: zentol):
Would this still have the desired effect of hiding 1.3 docs without 
regenerating them specifically?

We already added canonical link elements to point to the latest version; sounds 
like we're duplicating efforts here.
