On 2014-05-29 05:36, Kevin Ballard wrote:
[--snip--]
> And when dealing with a sequence in a precise encoding, the natural unit to
> work with is the code unit (and this has precedence in other languages,
> such as JavaScript, Obj-C, and Go).
> 

JavaScript:

  $ node
  > var s = "hï"; // Note the accent
  undefined
  > s.length;
  2

Rust:

  $ cat hello.rs
  fn main() {
    let l = "hï".len();     // Note the accent
    println!("{:u}", l);
  }
  $ rustc hello.rs
  $ ./hello
  3

No matter how defective the notion of "length" may be, personally I
think that people will expect the former, but will be very surprised by
the latter. There are certainly cases where the JavaScript version is
wrong, but I conjecture that it "works" for the vast majority of cases
that people and programs are likely to encounter.
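
To make the difference concrete, here is a minimal sketch of the three
counts involved, written against present-day Rust rather than the 2014
compiler used above. The helper names (utf8_bytes, utf16_units,
scalar_values) are mine, purely for illustration; the only library calls
are str::len, str::encode_utf16 and str::chars. The second test string,
U+1D11E, is one of the cases where the JavaScript count stops matching
what a person would call one character.

```rust
// Three common notions of "length" for the same string.
// The helper names are illustrative, not part of the standard library.

/// Number of UTF-8 bytes (what `str::len` returns).
fn utf8_bytes(s: &str) -> usize {
    s.len()
}

/// Number of UTF-16 code units (what JavaScript's `length` reports).
fn utf16_units(s: &str) -> usize {
    s.encode_utf16().count()
}

/// Number of Unicode scalar values.
fn scalar_values(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // "h" + U+00EF (precomposed i with diaeresis), then U+1D11E,
    // which lies outside the Basic Multilingual Plane.
    for s in ["h\u{00EF}", "\u{1D11E}"] {
        println!(
            "{:?}: {} UTF-8 bytes, {} UTF-16 units, {} scalar values",
            s,
            utf8_bytes(s),
            utf16_units(s),
            scalar_values(s)
        );
    }
}
```

For the first string the byte count is 3 while the other two counts are 2;
for the second the byte count is 4, the UTF-16 count is 2 and the scalar
count is 1.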

IMO expecting people to read docs is a poor substitute for being
explicit in a method name about what the method does, especially when it
costs only 5 characters. The Principle of Least Astonishment and all that.
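
As a rough sketch of what an explicit name could look like, assuming the
five extra characters are a prefix such as "byte_": the trait ByteLen and
the method byte_len below are hypothetical names invented for this
example, not an existing or proposed standard-library API, and the code
targets present-day Rust.

```rust
// Hypothetical extension trait: `ByteLen` and `byte_len` are made-up
// names used only to illustrate an explicit method name.
trait ByteLen {
    /// Length of the string in UTF-8 bytes.
    fn byte_len(&self) -> usize;
}

impl ByteLen for str {
    fn byte_len(&self) -> usize {
        // Delegates to the existing `str::len`, which counts bytes.
        self.len()
    }
}

fn main() {
    let s = "h\u{00EF}"; // "h" + U+00EF
    println!("{}", s.byte_len());
}
```

The call site then says which unit is being counted without a trip to the
docs.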

As a rule people don't read docs until they've encountered a "bug" in
their expectations vs. what the language/library actually does -- at
which point they're already annoyed and don't need to be further annoyed
by the realization that "it does something completely non-intuitive"
(from their perspective).

Thankfully the programming world has become more aware of i18n issues,
but for people who still predominantly use ASCII such bugs may lie
dormant for a long time before anyone discovers them.

Just my €0.02.

Regards,

