Re: [idn] Re: character tables

John C Klensin Wed, 02 Mar 2005 07:48:29 -0800

Erik,

A few observations...


        (1) First, a registry does have the right to require
        that registrants observe particular rules and conditions
        in subdomains they delegate and to pass those rules down
        the tree.  Whether that is wise or sensible is another
        issue, and enforceability is yet another question.
        But, unless national law prevents it, RFC 1591, to which
        all TLD registries more or less agreed, rather
        explicitly provides for passing the responsibilities to
        the community down the tree.  Even ignoring troublesome
        concepts like "require" and "enforce", certainly nothing
        prevents registries from educating and persuading
        registrants about how they should behave.
        
        (2) In my regular role as a luser, I really like fast,
        easily-used, small-footprint browsers.  I'm more
        security-conscious and suspicious than the average user,
        and therefore also like handy tools to help me dissect
        and verify things that might look suspicious.  Tying up
        a browser with heuristics, such as mixed-script
        detectors, that may not work well and have a large
        footprint, doesn't impress me as a good tradeoff.  For
        better or worse, the assumption of a decade ago that
        most criminals, especially most electronic criminals,
        were stupid is no longer applicable, if ever it was.
        That implies, I think, that if we design a simple test
        that blocks some look-alike cases but permits other,
        more subtle, ones, we will simply drive the phishers to
        better understand and use the subtle stuff: not a good
        tradeoff.

        (3) As far as surfing around the world is concerned,
        we've got a situation today in which the domain name
        associated with a particular URL does not really predict
        the content to be found on that page.  That will
        undoubtedly get worse, as more folks discover that the
        intersection of domain and host administration with web
        site organization often makes it much easier to maintain
        versions of pages in multiple languages in the same,
        rather than different, DNS trees.  So, since I don't
        read Chinese, I'm unlikely to frequently seek out pages
        whose content is in Chinese.  But I frequently find
        pages I can read via URLs that contain elements written
        in pinyin.  I fully expect those elements, and some of
        the subdomain names, will shift to Chinese characters as
        IDNs and IRIs are more widely available.   I also expect
        that transition will make things more comfortable for
        someone who reads Chinese and would prefer to not deal
        with Latin characters and harder for me, but that is a
        reasonable tradeoff over which none of us will have much
        influence.
        
        (4) We need to get unstuck from thinking about this
        purely as a browser problem.   The usual phishing attack
        involves an email message containing a link.  For those
        email clients that don't immediately invoke a full
        browser as soon as a link appears --and many of those
        links occur in plain-text, not HTML, email-- they are
        invoking the browser when the link is clicked on.  The
        situation in the browser is then different, since none
        of the "hover over link", "look at status bar", etc.,
        tools are going to apply, or, at least, are not going to
        work in the ways that some of these discussions suggest
        for links that appear on web pages that are already open
        in the browser.  Now, we have given MUA writers no
        advice about what they should pass to the browser if
        they see an IRI or otherwise-encoded string that
        contains an IDN.  If they pass the IRI/native-script-form
        IDN, they risk passing it to a browser version that
        doesn't have a clue.  So maybe they force the thing into
        URI/punycode form and pass that.
        Now, do you really want the browser to look at the
        thing, perform ToUnicode on the name (which, of course,
        may yield something other than what the user saw),
        perform some tests, and then pop up a "you just passed
        me an IDN that looks suspicious, do you really want to
        open that page?" box?  I think probably not.   Moreover,
        I think that, if you do, there would quickly be a
        sufficient number of false positives (positive for bad
        stuff) to get users really used to clicking "yes"
        without thinking... and cursing the browser implementer
        for bothering them with a pointless warning.
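
To make (2) and (4) a bit more concrete, here is a minimal Python
sketch of the kind of check being discussed.  It is an illustration
under stated assumptions, not a recommendation.  The helper names
(to_unicode, scripts_in_label, is_mixed_script) are hypothetical, and
the script guess uses the first word of each character's Unicode name
as a crude stand-in for the real Unicode Script property.  Decoding
relies on Python's built-in "idna" codec, which implements IDNA2003.

```python
import unicodedata


def to_unicode(hostname: str) -> str:
    """Decode an ASCII (xn--) hostname to Unicode, label by label.

    Relies on Python's built-in "idna" codec (IDNA2003 ToUnicode).
    """
    return hostname.encode("ascii").decode("idna")


def scripts_in_label(label: str) -> set:
    """Guess the scripts used by the letters in a label.

    Assumption: the first word of a character's Unicode name
    ("LATIN", "CYRILLIC", ...) approximates its script.  This is a
    shortcut for illustration, not the Unicode Script property.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def is_mixed_script(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts_in_label(label)) > 1


# Latin letters with CYRILLIC SMALL LETTER A in second position.
mixed = "p\u0430ypal"
# Entirely Cyrillic, but renders much like Latin "coco".
whole = "\u0441\u043e\u0441\u043e"

# What an MUA might hand to the browser, and what ToUnicode gives back.
wire = mixed.encode("idna").decode("ascii")
print(wire, "->", to_unicode(wire))

print(is_mixed_script(mixed))  # True
print(is_mixed_script(whole))  # False
```

The mixed label is flagged, but the all-Cyrillic one passes untouched,
which is the "simple test that permits the more subtle cases" problem
from (2).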

So my conclusion is that we need a mixed
protocol-registry-browser strategy.  That strategy, IMO, should
shift the processing burdens as much as possible to the first
two.  And I think that notions that the problem can or should be
solved in any of those three places alone are probably misguided.

     john

--On Tuesday, 01 March, 2005 20:47 -0800 Erik van der Poel
<[EMAIL PROTECTED]> wrote:

>> However, I note that this particular conversation is between
>> a browser  developer (Gervase) and one of the IDNA authors
>> (Paul), neither of which  is a registry representative, so
>> why exactly are you 2 having this  conversation? :-)
>> 
>> Sorry, I'm half joking. Half, because you two have every
>> right to  discuss whatever you wish. The other half because I
>> believe browser  developers can afford to focus more on their
>> end of things.
> 
> Sorry, I've been told that this half-joking thing was
> confusing, and I now believe I shouldn't have tried to be so
> cute.
> 
> All I'm trying to say to *Gervase* is that it doesn't really
> matter *what* characters are allowed to be registered in a
> registry, as long as the browser takes steps to warn the user
> when something phishy might be going on, e.g. a slash
> homograph, or a Cyrillic small 'a' when the user was probably
> expecting a Latin small 'a'. As I have pointed out, the
> registry does *not* have control over higher-numbered level
> domains. E.g. .de controls the 2nd level domain (2LD), but not
> the 3LD, 4LD and so on. That is where the slash homograph
> problem *really* matters.
> 
>> Instead, I wish the browser developers would 
>> focus more on the *user*, who may be "surfing" from one site
>> to the  next, spanning the globe, and crossing language
>> boundaries.
> 
> Sorry, this may not have been the best logic to use in my
> argument. It would have been better to talk about phishers,
> who often spam users with email containing URIs that *could*
> contain IDN labels with dangerous homographs at any level of
> the name, 2LD, 3LD, or whatever.
> 
> (Most users *don't* surf around the world, since many are
> monolingual or maybe bilingual.)
> 
> Anyway, help me out, guys and gals. Pull my logic through the
> wringer, and comb it with the finest comb you have at your
> disposal. This way, we can collectively improve our
> understanding of the IDN phishing problem and ways to address
> it.
> 
> Erik
