[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-20 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693364#comment-16693364
 ] 

ASF subversion and git services commented on SOLR-12746:


Commit 4efaecac34159ee1b718c548530d20d80c1fb4bf in lucene-solr's branch 
refs/heads/jira/http2 from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4efaeca ]

SOLR-12746: Fix formatting for callout list numbers and preamble sections


> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692261#comment-16692261
 ] 

ASF subversion and git services commented on SOLR-12746:


Commit bbe9511894340df14b938565cf3575540ca2f21b in lucene-solr's branch 
refs/heads/branch_7_6 from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=bbe9511 ]

SOLR-12746: Fix formatting for callout list numbers and preamble sections


> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-19 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692271#comment-16692271
 ] 

Cassandra Targett commented on SOLR-12746:
--

The change to remove dependence on the {{asciidoctor-html5s}} gem broke the 
ability for numbers in callout lists to be formatted properly. The most recent 
commit before this comment fixes those. 

I also noticed I had missed the formatting of "preamble" or "lead" sections on 
each page - it's supposed to be slightly larger. I added the new class/element 
structure to the CSS definitions and that fixed them.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692259#comment-16692259
 ] 

ASF subversion and git services commented on SOLR-12746:


Commit d37554e88ecfab2c274bbfd185856b8f737ea350 in lucene-solr's branch 
refs/heads/branch_7x from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d37554e ]

SOLR-12746: Fix formatting for callout list numbers and preamble sections


> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692256#comment-16692256
 ] 

ASF subversion and git services commented on SOLR-12746:


Commit 4efaecac34159ee1b718c548530d20d80c1fb4bf in lucene-solr's branch 
refs/heads/master from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4efaeca ]

SOLR-12746: Fix formatting for callout list numbers and preamble sections


> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-05 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675636#comment-16675636
 ] 

ASF subversion and git services commented on SOLR-12746:


Commit 2633e0e0cfc2278aa08ae1f066e02b681e7b7fee in lucene-solr's branch 
refs/heads/branch_7x from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2633e0e ]

SOLR-12746: Simplify the Ref Guide HTML structure and use semantic HTML tags 
where possible. Adds new template files for Asciidoctor HTML conversion.


> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-05 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675291#comment-16675291
 ] 

Cassandra Targett commented on SOLR-12746:
--

I'm going to let this sit a little bit on master before backporting to 
branch_7x. I want to make sure I have the changes to the Jenkins Ref Guide 
build script right before propagating any mistakes there.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-11-05 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675290#comment-16675290
 ] 

ASF subversion and git services commented on SOLR-12746:


Commit 1e3cc4861aafb832065ccba1d14c2c0280a449ba in lucene-solr's branch 
refs/heads/master from [~ctargett]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1e3cc48 ]

SOLR-12746: Simplify the Ref Guide HTML structure and use semantic HTML tags 
where possible. Adds new template files for Asciidoctor HTML conversion.


> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-30 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669132#comment-16669132
 ] 

Cassandra Targett commented on SOLR-12746:
--

Thanks again Alexandre, I'm glad to figure out those lines were causing the 
problem.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-30 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668938#comment-16668938
 ] 

Alexandre Rafalovitch commented on SOLR-12746:
--

I can build on master yes. And it now builds on the branch with those two 
dependencies commented out.

And the HTML looks so much cleaner now! And smaller.

There are still build errors for PDF and a deprecation warning for HTML stages, 
but both of these are on trunk as well, so not an issue here.

+1

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-30 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668789#comment-16668789
 ] 

Cassandra Targett commented on SOLR-12746:
--

I figured out that we can remove the dependency on the {{asciidoctor-html5s}} 
gem by removing the 2 lines at the start of {{_templates/helpers.rb}} that 
require it. This seems to have no impact on the templates (but we can't remove 
the file entirely; the templates rely on some of what it does).

[~arafalov], you're able to build the Ref Guide HTML fine on master, right?

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-30 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668682#comment-16668682
 ] 

Cassandra Targett commented on SOLR-12746:
--

I wrote up in the README in the branch that all you should need is the {{slim}} 
gem. You're running into a similar problem I had when I tried to integrate 
{{asciidoctor-html5s}} directly, so I did not integrate that project directly; 
I only copied the templates themselves.

However, I listed my own gems and realized that I've been running my tests with 
that gem installed; removing it causes the errors you probably saw about it 
being missing. I'll add it back in and take a look at what's happening and fix 
the README accordingly. Thanks for trying it out to find this.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-29 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668030#comment-16668030
 ] 

Alexandre Rafalovitch commented on SOLR-12746:
--

I tried to get this to work (ant clean build-site) but was not able to.

First, it was missing a gem, but I was able to progress by installing 
asciidoctor-html5s. Except it was refusing to install with normal command, I 
had to use:
{code:java}
sudo gem install asciidoctor-html5s --pre{code}
 

Then it was failing with conversion error on 
'field-type-definitions-and-properties.adoc' due to 'Could not find a converter 
to handle transform: empty'.

 
{code:java}
-build-site:
 [echo] Running Jekyll...
 [exec] Configuration file: 
/home/arafalov/Solr/solr-12746/solr/build/solr-ref-guide/content/_config.yml
 [exec] Source: /home/arafalov/Solr/solr-12746/solr/build/solr-ref-guide/content
 [exec] Destination: ../html-site
 [exec] Incremental build: disabled. Enable with --incremental
 [exec] Generating... Deprecation: The 'gems' configuration option has been 
renamed to 'plugins'. Please update your config file accordingly.
 [exec] 
 [exec] jekyll 3.5.0 | Error: Could not find a converter to handle transform: 
empty
 [exec] Conversion error: Jekyll::AsciiDoc::Converter encountered an error 
while converting 'field-type-definitions-and-properties.adoc':
 [exec] Could not find a converter to handle transform: empty
 [exec] Result: 1
build-site:
 [java] Exception in thread "main" java.lang.NullPointerException
 [java] at CheckLinksAndAnchors.main(CheckLinksAndAnchors.java:156)
{code}
 

 

I was not able to get past that. I am on Ubuntu. My gem list (gem list |grep 
"jekyll\|doctor\|slim"):

 
{noformat}
asciidoctor (1.5.6.2)
asciidoctor-html5s (0.1.0.beta.11)
jekyll (3.5.0)
jekyll-asciidoc (2.1.0)
jekyll-sass-converter (1.5.2)
jekyll-watch (2.1.2, 1.5.1)
slim (3.0.9) 
{noformat}
 

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-29 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667686#comment-16667686
 ] 

Cassandra Targett commented on SOLR-12746:
--

I can't find anything else that needs to be fixed in this right now - perhaps 
will find things later, but there's nothing really glaring that's wrong. I'm 
going to commit this tomorrow if there aren't any comments or objections.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-22 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659870#comment-16659870
 ] 

Cassandra Targett commented on SOLR-12746:
--

I've now updated the {{jira/solr-12746}} branch to master as of last night, and 
added a couple more CSS fixes, added a license reference to NOTICE.txt [1], and 
updated the README and {{dev-tools/scripts/jenkins.build.sh}} scripts for the 
proper Slim version as mentioned in the earlier comment [2].

[~arafalov], I think you were interested in this issue last week?

I think this is ready to go. I'll check it out a bit more before committing - 
thoughts/reviews are welcome.

[1] - I'm not sure if I really needed to include the license for 3 reasons: 1) 
we aren't distributing the templates at all, just the output of the templates; 
2) I borrowed only the templates while the project they are from includes much 
more; and 3) I also modified the templates to make integration easier, so they 
aren't the same as the originals. Out of abundance of caution and respect for 
the original author I included a mention in NOTICE.txt anyway.

[2] - The need to define the Slim version is temporary. After I mentioned to 
the Asciidoctor project that I had the problem and that downgrading Slim fixed 
it, they were able to identify the Slim API changes in Slim's v4.0 release that 
caused the problem. Asciidoctor's future 1.5.8 release (which we'll consume in 
some way, eventually) will include the fix. This is the issue that has the fix: 
https://github.com/asciidoctor/asciidoctor/issues/2928. The error is harmless, 
just alarming, so if anyone is using Slim 4.x and sees the error, they can 
continue without any problems. Downgrading just allows us to avoid having to 
see it 30+ times for every HTML build.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-10-18 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655875#comment-16655875
 ] 

Cassandra Targett commented on SOLR-12746:
--

Based on feedback I got from the Asciidoctor community, this error I mentioned 
as the #2 caveat earlier:

bq. There is an error output by the Slim engine ({{Slim::Engine: Option 
:asciidoc is invalid}}) during the HTML build for every template (so, 30+ 
times).

is resolved by downgrading the Slim version to v3.0 instead of 4.0.1 which I'd 
installed as the latest since the templates didn't specify any specific 
version. I don't think we care about the Slim version, so I can update the 
README and Jenkins scripts to force this version and we can call that problem 
resolved.

I found some more CSS changes to fix & still a TODO or two I'd mentioned 
earlier, so I'll update the branch with these changes as soon as I can (next 
week).

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-09-20 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622419#comment-16622419
 ] 

Cassandra Targett commented on SOLR-12746:
--

There is now a branch for this work, that is getting close to being ready to 
merge: 
https://git1-us-west.apache.org/repos/asf?p=lucene-solr.git;a=tree;h=refs/heads/jira/solr-12746;hb=refs/heads/jira/solr-12746

Some info:

# These changes require us to add a new {{_templates}} directory to direct 
Asciidoctor to use different selectors and classes when building the HTML. I 
started out with templates from https://github.com/jirutka/asciidoctor-html5s, 
but modified them in many ways to change their classnames to the ones we were 
already using to simplify the process of fixing our CSS files. 
** I have not yet dug into adding license info to Solr for use of these (or if 
I even need to since we aren't distributing the templates themselves), but the 
project uses the MIT license so it should be fine whatever we end up needing to 
do (TODO).
# The Liquid templates used by Jekyll are still there, and have been modified 
to use {{}} and {{}} tags instead of divs to identify the 
sections of the page that are content vs navigational elements.
# I tried to simplify some of the layers of divs, but there's possibly more 
that could be done. For example, for a paragraph there used to be about 6 
nested divs, like: {{column > post-content > main-content > sect1 > sectionbody 
> paragraph > p}}, but now it's closer to: {{column > content > sect1 > p}}.
# I threw in some other CSS changes for stuff that has been bugging me - 
specifically the padding of 2nd level bullets in the in-page TOC, and changing 
the 2nd level bullets to use an open circle instead of "-".

Caveats:

# The templates require that you have Slim installed locally in order to build 
the HTML. I've added instructions for this to {{solr-ref-guide/README.txt}} in 
the branch, but have not updated the Jenkins build script yet (TODO).
# There is an error output by the Slim engine ({{Slim::Engine: Option :asciidoc 
is invalid}}) during the HTML build for every template (so, 30+ times). I 
suspect it's related to a part of our Jekyll config that we have to have. There 
is supposedly some way to declare to Slim that it should ignore this, but I 
haven't yet been able to figure it out yet. I also asked about it on the 
Asciidoctor mailing list, but have not yet had a reply (TODO).

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-09-05 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604966#comment-16604966
 ] 

Cassandra Targett commented on SOLR-12746:
--

And, a little bit more about how Readability works in FF - it's assumed Safari, 
etc., work similarly:

* Comment to a StackOverflow question about how FF's Reader View works: 
https://stackoverflow.com/questions/30661650/how-does-firefox-reader-view-operate/30688312#30688312
* The GitHub repo for FF's Readability algorithm: 
https://github.com/mozilla/readability

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-09-05 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604963#comment-16604963
 ] 

Cassandra Targett commented on SOLR-12746:
--

There is also a sample library of custom backends provided by the Asciidoctor 
project: https://github.com/asciidoctor/asciidoctor-backends. However, the 
README says it only supports Slim v2.x (the project is now on v3.x), so these 
may be out of date. Many of the provided sample output formats have also been 
supplanted by other projects so it's possible these have not been kept up to 
date.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

2018-09-05 Thread Cassandra Targett (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604960#comment-16604960
 ] 

Cassandra Targett commented on SOLR-12746:
--

There is a project to add a new HTML5 backend (aka output format) for 
Asciidoctor: https://github.com/jirutka/asciidoctor-html5s/ that in theory 
would only require us to add a new Ruby gem to the requirements to build. I 
experimented with this briefly but was not able to get it to work properly with 
Jekyll and our build processes. I manually copied in the templates it uses and 
got an error that it did not know what to do with some of our content, so my 
assessment is that this project isn't yet ready for complex docs like ours.

> Ref Guide HTML output should adhere to more standard HTML5
> --
>
> Key: SOLR-12746
> URL: https://issues.apache.org/jira/browse/SOLR-12746
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{}} 
> tags to the content which break up our content into very small chunks. This 
> is acceptable to a casual website reader as far as it goes, but any Reader 
> view in a browser or another type of content extraction system that uses a 
> similar "readability" scoring algorithm is going to either miss a lot of 
> content or fail to display the page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide a better more 
> semantic-oriented HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it's otherwise 
> unknown as of yet what parts of the output are causing the worst of the 
> problems. This issue is to explore how to fix it to improve this part of the 
> HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org