#35533: urlize() makes a bit of a mess of links embedded in Markdown
--------------------------------+--------------------------------------
     Reporter:  Simon Willison  |                    Owner:  nobody
         Type:  Bug             |                   Status:  new
    Component:  Utilities       |                  Version:  5.1
     Severity:  Normal          |               Resolution:
     Keywords:                  |             Triage Stage:  Unreviewed
    Has patch:  0               |      Needs documentation:  0
  Needs tests:  0               |  Patch needs improvement:  0
Easy pickings:  0               |                    UI/UX:  0
--------------------------------+--------------------------------------
Comment (by Simon Willison):

 The ideal output for this example would be:
 {{{
 Annotated versions of talks I have given, with extensive notes and
 additional links. Here's [how I make these](<a
 href="https://simonwillison.net/2023/Aug/6/annotated-presentations/"
 rel="nofollow">https://simonwillison.net/2023/Aug/6/annotated-presentations/</a>).
 }}}
 Alternatively, not URLizing at all would be preferable to URLizing in a
 way that produces broken links.

 I think what's happening here is the logic that looks for non-protocol
 links that include or end in .net may be kicking in, and deciding that the
 following is the URL that should be linked:
 {{{
 these](https://simonwillison.net/2023/Aug/6/annotated-presentations/)
 }}}
 It's hard to suggest a fix for this. Ideally the code would "notice" that
 `these](https://simonwillison.net/2023` is not a valid URL. Instead, the
 logic decides that it's probably valid and just needs `http://` glued on
 the start.

 Maybe we could have code that notices that
 `http://these](https://simonwillison.net/2023/Aug/6/annotated-presentations/)`
 is NOT a valid URL - you cannot have `](` in the middle of the hostname
 portion - and hence decides not to URLize it at all.
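
 As a rough illustration of that idea, here is a minimal sketch of a
 hostname sanity check that could run before `http://` gets glued on. The
 helper name `looks_like_valid_host` and the regex are hypothetical and are
 not part of `django.utils.html`. It assumes ASCII-only hostnames, so it
 would need more work to cope with internationalized domain names.

```python
import re

# Hypothetical helper for illustration; not Django's actual implementation.
# A host label: letters, digits and hyphens, 1-63 chars, no leading or
# trailing hyphen. ASCII only (IDN hostnames are out of scope here).
_HOST_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def looks_like_valid_host(candidate: str) -> bool:
    """Return True if the host part of a scheme-less candidate is plausible."""
    # Everything before the first "/", "?" or "#" is treated as the host part.
    host = re.split(r"[/?#]", candidate, maxsplit=1)[0]
    # Split off an optional ":port" suffix; a non-numeric port is rejected.
    host, _, port = host.partition(":")
    if port and not port.isdigit():
        return False
    # Require at least two dot-separated labels, each of them well formed.
    labels = host.split(".")
    return len(labels) >= 2 and all(_HOST_LABEL.fullmatch(label) for label in labels)
```

 With a check like this, the Markdown fragment above is rejected because
 its host portion is `these](https`, while a plain
 `simonwillison.net/2023/...` candidate still passes and can be linked.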

 (Background: the reason I'm seeing this is that my Django SQL Dashboard
 software tries to URLize text it displays, but has no way of knowing if a
 database column contains Markdown - this broken example came from
 https://simonwillison.net/dashboard/tags-with-descriptions/ )
-- 
Ticket URL: <https://code.djangoproject.com/ticket/35533#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
