[ 
https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett resolved SOLR-12746.
--------------------------------------
    Resolution: Done

Four consecutive Jenkins builds ran as expected, so I backported the 
change to 7x.

> Ref Guide HTML output should adhere to more standard HTML5
> ----------------------------------------------------------
>
>                 Key: SOLR-12746
>                 URL: https://issues.apache.org/jira/browse/SOLR-12746
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: documentation
>            Reporter: Cassandra Targett
>            Assignee: Cassandra Targett
>            Priority: Major
>             Fix For: 7.6, master (8.0)
>
>
> The default HTML produced by Jekyll/Asciidoctor adds many extra {{<div>}} 
> tags, which break our content into very small chunks. This is acceptable 
> to a casual website reader, but any Reader view in a browser, or any other 
> content extraction system that uses a similar "readability" scoring 
> algorithm, is going to either miss a lot of content or fail to display the 
> page entirely.
> To see what I mean, take a page like 
> https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable 
> Reader View in your browser (I used Firefox; Steve Rowe told me offline 
> Safari would not even offer the option on the page for him). You will notice 
> a lot of missing content. It's almost like someone selected sentences at 
> random.
> Asciidoctor has a long-standing issue to provide better, more semantic 
> HTML5 output, but it has not been resolved yet: 
> https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by 
> providing your own in Slim, HAML, ERB or any other template language 
> supported by Tilt (none of which I know yet). There are some samples 
> available via the Asciidoctor project which we can borrow, but it is not 
> yet known which parts of the output cause the worst of the problems. This 
> issue is to explore how to fix that and improve this part of the HTML 
> reading experience.
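
One rough way to see how heavily a rendered page is wrapped in {{<div>}} 
tags is to count them and measure how deeply they nest. The sketch below is 
only an illustration, not part of the change made for this issue: the names 
DivStats, div_stats and SAMPLE_HTML are hypothetical, the sample markup is a 
hand-written approximation of a single wrapped paragraph, and the numbers it 
reports are not the score that any browser's Reader view actually computes.

```python
from html.parser import HTMLParser


class DivStats(HTMLParser):
    """Collects rough <div> statistics for one HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # current <div> nesting depth
        self.max_depth = 0   # deepest <div> nesting seen so far
        self.div_count = 0   # total number of <div> start tags
        self.text_chars = 0  # characters of text, whitespace-trimmed per chunk

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.div_count += 1
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        # Guard against stray closing tags in malformed input.
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Counts all text nodes, including any script or style content.
        self.text_chars += len(data.strip())


def div_stats(html):
    """Return div count, maximum div nesting depth, and text length for html."""
    parser = DivStats()
    parser.feed(html)
    parser.close()
    return {
        "div_count": parser.div_count,
        "max_depth": parser.max_depth,
        "text_chars": parser.text_chars,
    }


# Illustrative sample only: one paragraph wrapped in three nested divs.
SAMPLE_HTML = (
    '<div class="sect1"><div class="sectionbody">'
    '<div class="paragraph"><p>Hello world</p></div>'
    "</div></div>"
)

if __name__ == "__main__":
    print(div_stats(SAMPLE_HTML))
```

Running the same function over a page built with the default templates and 
over one built with custom templates would give a crude before/after 
comparison. A low ratio of text characters to {{<div>}} tags suggests, but 
does not prove, that the content is fragmented.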



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
