[ 
https://issues.apache.org/jira/browse/YETUS-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472158#comment-15472158
 ] 

Allen Wittenauer commented on YETUS-457:
----------------------------------------

bq. It seems like a bug that the JIRA ID to form the URL is going through 
sanitize_text.  If we are worried about special characters in JIRA IDs

The only special character that's going to be escaped in the JIRA URL is the 
hyphen.  There's literally zero reason to escape a hyphen.  The only time 
multimarkdown recognizes it is at the beginning of an unordered list item, so 
the risk of not escaping it is insignificantly low.  In the places where we 
might want to consider escaping it, it's pretty much a feature to *not* escape 
it, because it is almost certainly being used for exactly that.
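
To illustrate, here is a minimal sketch of what escaping the hyphen would do 
to the URL.  The names (sanitize_with_hyphen, BASE) are hypothetical and this 
is not the actual RDM code; it only assumes that escaping means prefixing the 
character with a backslash.

```python
# Hypothetical sketch; not the actual RDM sanitize_text implementation.
BASE = "https://issues.apache.org/jira/browse/"


def sanitize_with_hyphen(text):
    """Backslash-escape hyphens, as an overly aggressive sanitizer might."""
    return text.replace("-", "\\-")


raw_id = "YETUS-457"

# URL built from the raw ID: a valid JIRA link.
good_url = BASE + raw_id

# URL built from the sanitized ID: contains a stray backslash.
bad_url = BASE + sanitize_with_hyphen(raw_id)

print(good_url)
print(bad_url)
```

Leaving the hyphen out of the escape set avoids the second form entirely, 
whether or not the ID goes through the sanitizer.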

bq. Plus the advantages if it helps support other MD parsers.

There are literally tens, if not hundreds, of different markdown processors, 
each with its own quirks.  We can't support them all.  RDM spits out 
multimarkdown.  If those parsers don't support multimarkdown, then they're 
already in trouble.  Never mind bugs, etc., in those processors.  We're already 
dealing with issues in just one!

bq. I think we should also still escape the HTML entities, since there could be 
a JIRA with a summary like "Add missing <i> tag", which would be picked up as 
inline HTML and not auto-escaped by Markdown. We'd want this displayed 
literally though.

Take a look at HDFS-9220 in:
* https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/site/markdown/release/2.7.2/CHANGES.2.7.2.md
* http://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-project-dist/hadoop-common/release/2.7.2/CHANGES.2.7.2.html

It has a < that is already escaped with a backslash, and it passes through mvn 
site, GitHub, and GitLab properly.  It is rendered as a &lt; in all three 
cases.

So no, I'm pretty much not convinced that being aggressive here is a good idea. 
In fact, when I compare it to the most common use cases of the output, we're 
actually degrading the quality significantly.  The original list covers the 
vast majority of real-world use cases.  It sounds like we just need to add 
single quotes, square brackets, and dollar signs to cover pandoc and we're good 
to go.
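
A rough sketch of that approach follows.  The base character set here is an 
assumption standing in for the original list, which is not reproduced in this 
comment, and escape_markdown is a hypothetical name rather than the RDM 
function.  Only the three pandoc additions come from the discussion above.

```python
# Hypothetical sketch; not the actual RDM code.
# ASSUMED_BASE_CHARS is a placeholder for "the original list".
ASSUMED_BASE_CHARS = "\\`*_{}()#+!<>|"

# Additions suggested above to cover pandoc:
# single quote, square brackets, dollar sign.
PANDOC_EXTRA_CHARS = "'[]$"

ESCAPE_CHARS = ASSUMED_BASE_CHARS + PANDOC_EXTRA_CHARS


def escape_markdown(text, chars=ESCAPE_CHARS):
    """Backslash-escape every character of text that appears in chars.

    The hyphen is deliberately not in the set, so JIRA IDs pass through
    unchanged.
    """
    return "".join("\\" + ch if ch in chars else ch for ch in text)


print(escape_markdown("Add missing <i> tag"))
print(escape_markdown("YETUS-457"))
```

With this set, a summary such as "Add missing <i> tag" gets its angle brackets 
escaped, while an ID such as YETUS-457 is left alone.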



> RDM does not properly escape entities
> -------------------------------------
>
>                 Key: YETUS-457
>                 URL: https://issues.apache.org/jira/browse/YETUS-457
>             Project: Yetus
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Critical
>         Attachments: YETUS-457.001.patch, YETUS-457.002.patch
>
>
> Noticed while browsing the Hadoop 3.0.0-alpha1 changelog. Quotes and possibly 
> some other entities are not escaped properly, leading to malformed markdown 
> output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
