#36131: URLValidator does not allow URLs without a top level domain (except for
localhost)
-------------------------------+-----------------------------------------
     Reporter:  Ludwig Kraatz  |                    Owner:  Ludwig Kraatz
         Type:  Bug            |                   Status:  closed
    Component:  Core (Other)   |                  Version:  5.1
     Severity:  Normal         |               Resolution:  duplicate
     Keywords:  URL Validator  |             Triage Stage:  Unreviewed
    Has patch:  1              |      Needs documentation:  0
  Needs tests:  0              |  Patch needs improvement:  1
Easy pickings:  0              |                    UI/UX:  0
-------------------------------+-----------------------------------------
Changes (by Ludwig Kraatz):

 * easy:  1 => 0
 * type:  New feature => Bug

Comment:

 Thank you Sarah, for putting in the effort - and taking the time for this
 ticket.

 == Really a Bug || hope you reconsider || though i do see the point (to a
 degree)

 I do not consider this a new feature - I consider this **correcting a
 misleading/''outdated'' implementation** of URLValidator.
 I had hoped to point out that the way URLs are defined in RFC 3986 and
 the RFCs surrounding it is not the same as the way the Django
 URLValidator validates them.

 
https://github.com/django/django/blob/9cc3970eaaf603832c075618e61aea9ea430f719/docs/ref/validators.txt#L182
 Consider even how the Django docs describe the URLValidator.
 They state, plain and simple ("in other words"): it does not validate
 URLs, but what Django considers relevant and "URL-looking".
 {{{
     A :class:`RegexValidator` subclass that ensures a value looks like a URL,
 ...
             Values starting with ``file:///`` will not pass validation even
 }}}

 I do understand the downstream implications. Which is why I added this
 ticket after my initial "pull request" (not talking about the quality of
 that request here..) - and the tests it broke.

 I do see how one could argue that Django operates mostly in the Web
 domain and that, as such, a URLField validating that kind of URL seems
 reasonable.
 But then again - adding a hardcoded 'localhost' really makes this
 argument a little obsolete. Because, try {{{https://localhost}}} on a
 machine that is not set up to serve that resource.. It's exactly the same as
 {{{https://printer}}}. If it is set up - it works. If not, it does not.
 The only difference - 'localhost' is easier for developers using Django.
 But that makes it superficial - no offense. I made that argument before..
 I will elaborate on why "superficial naming" is a **bug**.
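
 A minimal sketch of the asymmetry described above. The pattern below is a
 simplified stand-in written for illustration, not Django's actual regex;
 the names {{{HOST_RE}}} and {{{host_accepted}}} are hypothetical. It only
 shows the shape of the rule: a host needs a dotted top-level domain, unless
 it is literally 'localhost'.

```python
import re

# Simplified stand-in, NOT Django's real pattern: one or more DNS labels
# followed by a dotted TLD, or the single hardcoded name "localhost".
LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
TLD = r"\.[a-z]{2,63}"
HOST_RE = re.compile(rf"(?:{LABEL}\.)*{LABEL}{TLD}|localhost", re.IGNORECASE)


def host_accepted(host: str) -> bool:
    """Return True if the host matches the simplified rule above."""
    return HOST_RE.fullmatch(host) is not None


# Print how each example host fares under the simplified rule.
for host in ("localhost", "printer", "printer.local", "example.com"):
    print(host, host_accepted(host))
```

 Under this rule 'localhost' passes and 'printer' does not, although both
 are single-label names that only work where something is set up to serve
 them.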

 Just to make the inconsistency clear:
 {{{localhost}}} was reserved by RFC 6761.
 mDNS, which Django's URLValidator (as an EXAMPLE) 'violates', is RFC
 6762.
 Both are from February 2013.
 The URLValidator predates that. I'm pretty sure I used it back in
 2010/11.

 The field is not called a "DNSURLField" or "WebURLField" or
 "CustomURLField" - and as such, the implementation simply does not match
 what it claims to be.
 Which, in turn, leads to very avoidable problems.

 Oh my..
 
https://github.com/django/django/blob/9cc3970eaaf603832c075618e61aea9ea430f719/django/core/validators.py#L169

 I just realized Django "actually" implemented a
 "CustomWeb2010URLValidator" - calling it URLValidator..

 I **really** want to "emphasize" that the name of a "thing" plays a crucial
 role in what that thing should "do" or what that thing will be "expected
 to do". And as such - this plays a crucial role quality-wise, especially
 for a "framework for perfectionists (with deadlines)" -- if that's still
 how Django is labeled, as it was back when I started.

 == Why it matters || projects depend on django being 'reliable' || a
 concrete example

 The thing is, django is a framework, not a "simple project".

 Projects like Authentik depend on the conformity of things like the
 URLValidator in django:
 
https://github.com/goauthentik/authentik/blob/3daa39080a7866d83fad0fb3691e9e31397e0f6c/authentik/providers/saml/models.py#L43

 We use authentik in our intranet. We use it as a SAML IDP.
 It interacts with other Third-Party tools in our intranet. But - the
 intranet aspect is actually irrelevant here.
 The SAML Service Provider dictates its ACS-URL.

 As such, using URLs in a form the URLValidator currently (wrongly)
 rejects is mandatory for us.

 It's a piece of software that defines its ACS-URL as "https://example/resource"
 (other words, but that's the layout. BTW: do you see how the URL is
 accepted here as a URL? I did not declare it as some sort of link.)
 And that works just fine, because this URL is only used to validate
 that a SAML request is meant for the endpoint it is handled at. No
 URL-"calling" happens.
 It's a simple reference, in the form of a URL. A reference to a resource
 via its location.
 The location is locally scoped, sure. But we live in 2025+. Not back in
 pre-2010, when the only things happening were on the Web or the FS. (stupid
 remark, I know. ''I am sorry''.)
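
 To illustrate "used as a reference, never called": a minimal sketch that
 compares two URLs purely as identifiers. The function name
 {{{is_intended_endpoint}}} and its parameters are hypothetical and do not
 come from authentik or any SAML library; only the standard library's
 {{{urllib.parse.urlsplit}}} is used, and no name resolution or network
 access takes place.

```python
from urllib.parse import urlsplit


def is_intended_endpoint(request_destination: str, configured_acs_url: str) -> bool:
    """Compare two URLs purely as identifiers; nothing is resolved or fetched."""
    got = urlsplit(request_destination)
    want = urlsplit(configured_acs_url)
    # urlsplit lowercases the scheme, and .hostname lowercases the host,
    # so the comparison is case-insensitive for those two components.
    return (got.scheme, got.hostname, got.port, got.path) == (
        want.scheme, want.hostname, want.port, want.path
    )


print(is_intended_endpoint("https://example/resource", "https://EXAMPLE/resource"))
print(is_intended_endpoint("https://other/resource", "https://example/resource"))
```

 A single-label host such as 'example' is perfectly usable here, because
 the check never needs the name to be resolvable.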

 And as this software handles all of that internally, it works without
 anything having to be set up on (m)DNS / Host or elsewhere.
 It's just another application running - as if a normal user had installed
 something from some software store (that might just as well run as
 "https://localhost"... but in this case, it simply doesn't)


 == Why it matters, even in standard configurations

 Again, RFC 6762 covers mDNS, a standard that (as part of the whole
 zeroconf effort) is gaining more and more relevance in everyday IT usage.
 It is natively supported in macOS and ships by default in some Linux
 distros; just take Apple's Bonjour (as an example mDNS implementation).

 file:/// - is a URL. It might not be a "WebURL" or "DNSURL", but - it is a
 **URL**.
 Developers expecting Django to handle URLs might suffer headaches,
 because.. it simply "does not".
 And they might be completely caught off guard, because this
 DjangoURLField is so non-URLy.
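
 As a small illustration that such values are well-formed URLs at the
 syntax level, here is a sketch using only the standard library's
 {{{urllib.parse.urlsplit}}}. The helper name {{{describe}}} and the example
 URLs are made up for this sketch; it says nothing about what Django itself
 accepts.

```python
from urllib.parse import urlsplit


def describe(url: str) -> tuple:
    """Split a URL into (scheme, hostname, path) without validating the host."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


# A file URL has a scheme and a path but no host at all.
print(describe("file:///etc/hosts"))
# An mDNS-style name parses like any other host.
print(describe("https://printer.local/status"))
```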

 Naming things wrongly, and thereby creating a misconception of what they
 do, hinders us and others in simply utilizing a well-organized
 infrastructure of software.
 Which I would say is the goal of open source at its core..
 Which brings me to why this is a **bug**, **not a feature**.

 Software that so obviously does that - is **bug**gy.
 I came here to fix that, or at least to make sure I did my part - as well as
 I can.

 == Possible Proposition

 This is how I would do it. I get it if a {{{URLField(mode_feature='raw')}}}
 is something you are leaning towards.
 I simply would not do it that way, as I see this as an evolutionary
 adaptation and correction, not just a feature.

 1. Deprecation of URLField (as it never was, what it was called)
 2. creation of
   - {{{RawURLField + RawURLValidator}}}
   - {{{WebURLField + WebURLValidator // CustomURLField/Validator}}}
      - subclassing Raw*,
      - adding lazy-scheme, other restrictions and default stuff for
 backwards compatibility

 (even though I would prefer something like
 {{{URL_Web__Field/URL_Raw__Field}}} - or even better
 {{{Field__URL_Web}}}/.., but that's totally irrelevant in a Django
 context...)
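
 The split proposed above could look roughly like the following sketch. It
 is plain Python without Django, so the classes raise {{{ValueError}}}
 instead of Django's validation error; the class names follow the proposal,
 while the concrete checks, the {{{schemes}}} tuple and the {{{accepts}}}
 helper are assumptions made for illustration, not an existing API.

```python
from urllib.parse import urlsplit


class RawURLValidator:
    """Hypothetical: accept anything with a scheme plus a host or a path."""

    def __call__(self, value: str) -> None:
        parts = urlsplit(value)
        if not parts.scheme or not (parts.netloc or parts.path):
            raise ValueError(f"not a URL: {value!r}")


class WebURLValidator(RawURLValidator):
    """Hypothetical: the raw check plus web-specific restrictions."""

    schemes = ("http", "https", "ftp", "ftps")  # assumed default list

    def __call__(self, value: str) -> None:
        super().__call__(value)
        parts = urlsplit(value)
        if parts.scheme not in self.schemes:
            raise ValueError(f"scheme not allowed: {parts.scheme!r}")
        host = parts.hostname or ""
        # Backwards-compatible restriction: dotted host, or literal localhost.
        if host != "localhost" and "." not in host:
            raise ValueError(f"host has no top-level domain: {host!r}")


def accepts(validator, value: str) -> bool:
    """Helper for the demo: True if the validator raises nothing."""
    try:
        validator(value)
    except ValueError:
        return False
    return True


for url in ("https://printer", "file:///etc/hosts", "https://example.com"):
    print(url, accepts(RawURLValidator(), url), accepts(WebURLValidator(), url))
```

 With this layout the stricter behaviour lives in the subclass, so choosing
 between the two is an explicit decision in the code.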

 Benefits:
  - it is very clear (at least clearer) what one would get
  - one has to make a conscious decision about what one's own use case is
  - it would require a deliberate change in the code, instead of
 another option on the URLField/Validator that is probably ignored more
 often than is good.

 This way - sure - there are some hurdles when updating a project's Django
 dependency.
 But that's to be expected in a changing environment or when an implementation
 simply was not spot on.
 This way, at least everybody can decide consciously which one to go with.

 And I do see why you would want to involve the forum on this topic - but
 I simply don't.
 I see a bug - I report it, I put in effort for people to see and
 understand it, and I offer my help fixing it.
 But I don't "do people". Sry, but not sry. And most certainly no offense (:
 I have to look in the mirror from time to time - so, I'm no different than
 the rest of us.
 But I don't lobby or discuss options - especially not with the Django
 community.
 I tried that back then - I was shut down, pretty similar to how this
 almost ended up. Just to see my suggestion being implemented after a little
 time had passed - because it was the only thing that made sense....
 It does not work "for/with me" - if "you" have to consult your community
 on "how you" want to approach this. I'm totally OK with that.

 I just brought a bug to your attention, and offered my help.
 As this ticket remains closed, so does my involvement. (: not unhappy -
 it's just that my role no longer seems to be needed.

 == Additional remark about this FieldTest || Test Structuring suboptimal
 || Also requires some fixing IMHO

 My initial misconception about the test failure was based on
 the following situation:
 1. I changed the host_re to allow for more versatile "localhost"
 variants. (roughly "localhost" <=> [a-z0-9-]{2,63})
 2. (besides other, understandably failing tests) a test failed, because now
 {{{value='foo'}}} was accepted where the existing test expected a rejection.
 3. this happened because the lazy-scheme feature of the field kicked in,
 allowing 'foo' to pass, because it was lazily interpreted as
 "https://foo"

 => this is an **ISSUE**!
 I never touched anything close to the lazy-scheme feature, yet tests that
 are influenced by it fail.
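
 The interaction in steps 1-3 can be sketched as follows. This is a plain
 Python stand-in, not Django's field or validator code: the names
 {{{RELAXED_HOST_RE}}}, {{{validator_accepts}}} and {{{field_accepts}}} are
 hypothetical, the relaxed host rule is the rough one from step 1, and the
 assumed scheme "https" is taken from step 3.

```python
import re
from urllib.parse import urlsplit

# Hypothetical relaxed rule from step 1: a bare 2-63 character label
# may stand alone as a host (optionally preceded by dotted labels).
RELAXED_HOST_RE = re.compile(r"(?:[a-z0-9-]{1,63}\.)*[a-z0-9-]{2,63}", re.IGNORECASE)


def validator_accepts(value: str) -> bool:
    """Stand-in for the validator: needs an explicit scheme and a host."""
    parts = urlsplit(value)
    host = parts.hostname or ""
    return parts.scheme in ("http", "https") and RELAXED_HOST_RE.fullmatch(host) is not None


def field_accepts(value: str, assumed_scheme: str = "https") -> bool:
    """Stand-in for the field: prefix a scheme when none is given, then validate."""
    if "://" not in value:
        value = f"{assumed_scheme}://{value}"
    return validator_accepts(value)


print(validator_accepts("foo"))          # validator alone, no scheme
print(validator_accepts("https://foo"))  # validator alone, with scheme
print(field_accepts("foo"))              # field with lazy scheme
```

 The validator on its own still rejects 'foo'; only the field, which adds
 the scheme first, lets it through. A test that exercises the validator via
 the field therefore also changes its result.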

 Usually - I would expect the URLValidator to be tested rigorously, so
 that the URLField's additional features can be tested more cleanly - and
 in a more focused way.

 What I propose:
 The URLField "validation" should be tested for its "accordance with
 URLValidator", except when it comes to the lazy-scheme feature.
 That is where the URLField would allow for less strict validation.
 If it were handled this way, the FieldTest (point 2.) "would
 not have failed", because the URLValidator would have rejected the value, same as
 the URLField without lazy scheme, or the reverse - if passed with a scheme,
 the URLValidator would have accepted it, same as the URLField.

 As such - the "lazy scheming" feature would be tested on the field (where
 it originates), in a way that does not deliver false negatives as it
 "somehow did" in this situation (because URLValidator was being tested
 "VIA" URLField)
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36131#comment:13>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
