[ 
https://issues.apache.org/jira/browse/SOLR-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-10296:
-------------------------------------
    Description: 
We have developed several tools and scripts for converting the Ref Guide out of 
Confluence which get us most of the way to a fully converted set of pages. 
However, we already know that there are several issues that could not be 
automated.

From https://github.com/ctargett/refguide-asciidoc-poc/issues/27, we have this 
list:

* The conversion process will insert TODOs for several items that we thought 
might be problematic during conversion; these need to be reviewed and resolved. 
Some of these items are also covered in the topics below.
* Block elements in tables. The current version of the PDF creation tool we are 
using does not handle those properly (see 
https://github.com/ctargett/refguide-asciidoc-poc/issues/13). In some cases, we 
should remove the table entirely and present the content in a new way (using, 
most often, [labeled 
lists|http://asciidoctor.org/docs/user-manual/#labeled-list] instead).
* Review and (usually) remove huge Tables of Contents from the top of pages. 
The current design of the online version automatically creates a TOC for each 
page, so we don't need another one. In some cases this TOC was hand-created, 
so it can't be removed via conversion.
* Non-image attachments. Some SVG files will be converted to images, but they 
should not be treated as images.
* Failed link conversions. Despite my best attempts, many dummy URLs are 
treated by Confluence as real URLs (meaning, dummy URLs like 
{{http://<host>:<port>/solr}} are coded in Confluence's XHTML with <a> tags). 
These will be converted as URLs but will throw errors during the conversion 
process. In some cases, the URLs aren't just these example URLs but are 
indicative of a real problem that needs to be resolved.
* Spurious <br/> tags. Some API pages have a list of available calls structured 
as a list but without being a real ordered or unordered list. These will 
convert badly. The issue 
https://github.com/ctargett/refguide-asciidoc-poc/issues/31 has a list of pages 
where this might be a problem.
* Appropriate Lead Paragraphs. The stylesheet for HTML pages will make the 
first paragraph of every HTML page a slightly larger font, by way of 
introduction. In many cases, the first paragraph is not really ready for that 
sort of treatment and should be revised to be a more succinct introduction to 
the feature or further contents of the page.
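
Several of the items above (leftover TODOs, placeholder URLs, stray <br/> 
tags) can be located mechanically before a manual review. The following is a 
minimal sketch only, not part of the conversion tooling. It assumes the 
converted pages are .adoc files, that the inserted markers contain the literal 
word TODO, and that placeholder URLs look like the {{http://<host>:<port>/solr}} 
example; the pattern names and the functions scan_text and scan_tree are 
hypothetical.

```python
import re
from pathlib import Path

# Hypothetical patterns: the markers the conversion actually emits may differ.
PATTERNS = {
    "todo": re.compile(r"\bTODO\b"),
    # A URL containing an angle-bracket placeholder such as <host> or <port>.
    "placeholder-url": re.compile(r"https?://[^\s\]]*<[^>\s]+>"),
    # Literal line-break tags left over from Confluence's XHTML.
    "stray-br": re.compile(r"<br\s*/?>", re.IGNORECASE),
}


def scan_text(text):
    """Return (line_number, kind, line) tuples for lines needing manual review."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, kind, line.strip()))
    return findings


def scan_tree(root):
    """Scan every .adoc file under root; yield (path, line_number, kind, line)."""
    for path in sorted(Path(root).rglob("*.adoc")):
        text = path.read_text(encoding="utf-8")
        for number, kind, line in scan_text(text):
            yield path, number, kind, line
```

Calling scan_tree on the directory that holds the converted pages and printing 
each result gives a per-file checklist. A hit only flags a line for review; 
whether a flagged URL is a harmless example or a real problem still has to be 
decided by hand.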

More problems may be added to this issue as items that specifically need to be 
cleaned up.

  was:
We have developed several tools and scripts for converting the Ref Guide out of 
Confluence which get us most of the way to a fully converted set of pages. 
However, we already know that there are several issues that could not be 
automated.

From https://github.com/ctargett/refguide-asciidoc-poc/issues/27, we have this 
list:

* The conversion process will insert TODOs for several items that we thought 
might be problematic during conversion; these need to be reviewed and resolved. 
Some of these items are also covered in the topics below.
* Block elements in tables. The current version of the PDF creation tool we are 
using does not handle those properly (see 
https://github.com/ctargett/refguide-asciidoc-poc/issues/13). In some cases, we 
should remove the table entirely and present the content in a new way (using, 
most often, [labeled 
lists|http://asciidoctor.org/docs/user-manual/#labeled-list] instead).
* Review and (usually) remove huge Tables of Contents from the top of pages. 
The current design of the online version automatically creates a TOC for each 
page, so we don't need another one. In some cases this TOC was hand-created, 
so it can't be removed via conversion.
* Non-image attachments. Some SVG files will be converted to images, but they 
should not be treated as images.
* Failed link conversions. Despite my best attempts, many dummy URLs are 
treated by Confluence as real URLs (meaning, dummy URLs like 
{{http://<host>:<port>/solr}} are coded in Confluence's XHTML with <a> tags). 
These will be converted as URLs but will throw errors during the conversion 
process. In some cases, the URLs aren't just these example URLs but are 
indicative of a real problem that needs to be resolved.
* Spurious <br/> tags. Some API pages have a list of available calls structured 
as a list but without being a real ordered or unordered list. These will 
convert badly. The issue 
https://github.com/ctargett/refguide-asciidoc-poc/issues/31 has a list of pages 
where this might be a problem.

More problems may be added to this issue as items that specifically need to be 
cleaned up.


> Convert existing Ref Guide and post-conversion cleanup
> ------------------------------------------------------
>
>                 Key: SOLR-10296
>                 URL: https://issues.apache.org/jira/browse/SOLR-10296
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: documentation
>            Reporter: Cassandra Targett
>
> We have developed several tools and scripts for converting the Ref Guide out 
> of Confluence which get us most of the way to a fully converted set of pages. 
> However, we already know that there are several issues that could not be 
> automated.
> From https://github.com/ctargett/refguide-asciidoc-poc/issues/27, we have 
> this list:
> * The conversion process will insert TODOs for several items that we thought 
> might be problematic during conversion; these need to be reviewed and 
> resolved. Some of these items are also covered in the topics below.
> * Block elements in tables. The current version of the PDF creation tool we 
> are using does not handle those properly (see 
> https://github.com/ctargett/refguide-asciidoc-poc/issues/13). In some cases, 
> we should remove the table entirely and present the content in a new way 
> (using, most often, [labeled 
> lists|http://asciidoctor.org/docs/user-manual/#labeled-list] instead).
> * Review and (usually) remove huge Tables of Contents from the top of pages. 
> The current design of the online version automatically creates a TOC for each 
> page, so we don't need another one. In some cases this TOC was hand-created, 
> so it can't be removed via conversion.
> * Non-image attachments. Some SVG files will be converted to images, but they 
> should not be treated as images.
> * Failed link conversions. Despite my best attempts, many dummy URLs are 
> treated by Confluence as real URLs (meaning, dummy URLs like 
> {{http://<host>:<port>/solr}} are coded in Confluence's XHTML with <a> tags). 
> These will be converted as URLs but will throw errors during the conversion 
> process. In some cases, the URLs aren't just these example URLs but are 
> indicative of a real problem that needs to be resolved.
> * Spurious <br/> tags. Some API pages have a list of available calls 
> structured as a list but without being a real ordered or unordered list. 
> These will convert badly. The issue 
> https://github.com/ctargett/refguide-asciidoc-poc/issues/31 has a list of 
> pages where this might be a problem.
> * Appropriate Lead Paragraphs. The stylesheet for HTML pages will make the 
> first paragraph of every HTML page a slightly larger font, by way of 
> introduction. In many cases, the first paragraph is not really ready for that 
> sort of treatment and should be revised to be a more succinct introduction to 
> the feature or further contents of the page.
> More problems may be added to this issue as items that specifically need to 
> be cleaned up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
