[The Java Posse] Zero-based vs. one-based indexing

Peter Becker Fri, 16 Apr 2010 17:15:17 -0700

[was: a new programming language]

I don't find either Reinier's or Dijkstra's reasoning conclusive.


First Reinier's points:

RULE 1: Counting elements in a list is in the domain of the natural
numbers. Therefore, if negative numbers are needed the solution is
inferior.

Traditionally zero is not a natural number.

RULE 2: In a list of, say, 10 elements, it would be odd if '11' is
anything other than an Out-Of-Bounds number.

It isn't in either scheme.

The end index has to be to the RIGHT and not to the LEFT of the final
character. If it was to the LEFT, then you'd need negative numbers to
describe the empty set of a set with 1 element in it.

That is based on convenient implementation, not the perspective of the user. It's a leaky abstraction, not UI/API design.

list.subList(0, 0) would then describe a list of size 1, whereas we
want one of size 0, so we'd have to write list.subList(0, -1). That's
awkward, so end indices have to work like they do in java.

How often would you write something like list.subList(0, -1)? The only cases I can think of are tests for the list structure. It might appear in calculations, but that does not seem like an issue to me.

List copy = list.subList(1, 11);

Now '11' shows up as a valid number in a 10-length list. That's rather
annoying, as it doesn't feel very natural for 11 to be a meaningful
count into a size 10 list.

But the point of the proposed semantics of the subList(..) parameters is that the second is _behind_ the last element. Being one higher than the highest position seems perfectly natural to me.
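For comparison, a minimal sketch of how the existing java.util.List.subList behaves. It is zero-based rather than one-based as in the proposal above, but its end index is likewise exclusive, so it also points just behind the last element. The class name SubListBounds and the helper oneToN are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SubListBounds {
    // Hypothetical helper: builds the list [1, 2, ..., n].
    static List<Integer> oneToN(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = oneToN(10);

        // The end index is exclusive, so equal bounds describe the empty list.
        System.out.println(list.subList(0, 0).size());   // 0

        // The whole list: the end index is list.size(), one past the last valid index (9).
        System.out.println(list.subList(0, 10).size());  // 10

        // The length of the result is always toIndex - fromIndex.
        System.out.println(list.subList(3, 7));          // [4, 5, 6, 7]
    }
}
```

Here 10 is a valid end index for a 10-element list even though it is not a valid element index, which is the same situation as '11' in the one-based version.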


Now Dijkstra's:

The observation that conventions a) and b) have the advantage that the difference between the bounds as mentioned equals the length of the subsequence is valid.

Again: implementation perspective. Most people (at least the ones I deal with) are perfectly capable of figuring out how many numbers you have if you start with 11 and end with 15.
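To make the two ways of counting concrete, here is a small sketch. The class and method names are invented for this example; the only point is that the inclusive convention needs a "+ 1" and the half-open one does not.

```java
class RangeLength {
    // Number of integers i with lo <= i <= hi (both bounds inclusive).
    static int closedLength(int lo, int hi) {
        return hi - lo + 1;
    }

    // Number of integers i with lo <= i < hi (upper bound exclusive).
    static int halfOpenLength(int lo, int hi) {
        return hi - lo;
    }

    public static void main(String[] args) {
        System.out.println(closedLength(11, 15));    // 5: the numbers 11, 12, 13, 14, 15
        System.out.println(halfOpenLength(11, 16));  // 5: the same numbers, written as 11 <= i < 16
    }
}
```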

There is a smallest natural number. Exclusion of the lower bound —as in b) and d)— forces for a subsequence starting at the smallest natural number the lower bound as mentioned into the realm of the unnatural numbers.

What is the smallest natural number? I studied math in Germany and there I learned that there is a set N (let's not add too much TeX -- the N should have the double bar on the left), which is the sequence starting at one, then repeatedly adding one. The numbers including zero are N_0 (subscript zero). That is a bit old school, and there are many different conventions (see e.g. http://en.wikipedia.org/wiki/Natural_number). But saying "there is a smallest natural number" and implying it is zero hides the fact that conventions have been and continue to be different.

Adhering to convention a) yields, when starting with subscript 1, the subscript range 1 ≤ /i/ < /N/+1; starting with 0, however, gives the nicer range 0 ≤ /i/ < /N/.

What makes the second range nicer? I miss an argument supporting this statement. Despite having used C, C++ and Java as my primary languages for something like 15 years, I still feel the former is nicer.

My personal opinion is that there are valid reasons why it is convenient for an implementation of array structures to use the convention Dijkstra proposes as (a). You basically take the perspective of pointers (oddly enough not an argument Reinier or Dijkstra made). The first position is the array start + 0, the last position is array start + (n-1), and array start plus n is to the right of the last. Makes perfect sense to me.
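Java has no pointer arithmetic, so the following only simulates the pointer view with plain numbers. The base address 1000 and the element size of 4 bytes are arbitrary assumptions, and addressOf is a hypothetical helper, not a real API.

```java
class PointerView {
    // Simulated address of element i in an array that starts at `base`
    // and whose elements are each `elementSize` bytes wide.
    static long addressOf(long base, int elementSize, int i) {
        return base + (long) i * elementSize;
    }

    public static void main(String[] args) {
        long base = 1000L;     // assumed start address of the array
        int elementSize = 4;   // assumed element width in bytes
        int n = 5;             // number of elements

        long first = addressOf(base, elementSize, 0);      // array start + 0
        long last = addressOf(base, elementSize, n - 1);   // array start + (n-1)
        long end = addressOf(base, elementSize, n);        // just to the right of the last element

        System.out.println(first);                          // 1000
        System.out.println(last);                           // 1016
        System.out.println(end);                            // 1020
        System.out.println((end - first) / elementSize);    // 5, the element count
    }
}
```

With zero-based indices the index is exactly the offset from the start, which is why this convention is convenient for the implementation.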

Does that justify designing APIs in higher level languages that way? I don't really think so; I think other criteria should be applied. The most important one from my perspective would be the question "what is the programmer using the API most likely to expect?". The answer to that one depends on your target audience: if you have low-level coders or people trained in a C-style language, they will feel comfortable with the 0-based index. I believe pretty much anyone else (including most mathematicians) tends to prefer 1-based indexing.

Sometimes I think a programming language should just have a generic array a.k.a. map and optimize for integer ranges. Then you could say:


  var zeroBasedArray = new Map([0,5], String);
  var oneBasedArray = new Map([1,6], String);

And the rest is up to the compiler to optimize. I would expect both data structures to be represented internally in the same way.

Of course, apart from requiring integer ranges as types (quite doable), it leads to the problem that you can have a mix of both styles in one project, which might create a mess (I have no good answer to that). And the syntax above is too verbose, but that is a separate issue.
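A rough sketch of how such a range-indexed array could look as an ordinary Java class, without any compiler support. RangeArray is a made-up name, and reading [0,5] and [1,6] as inclusive bounds is an assumption, since the pseudo-syntax above does not say. Both variants end up with the same zero-based backing array.

```java
class RangeArray<T> {
    private final int lower;       // smallest valid index (inclusive)
    private final int upper;       // largest valid index (inclusive, by assumption)
    private final Object[] slots;  // always stored zero-based internally

    RangeArray(int lower, int upper) {
        if (upper < lower) {
            throw new IllegalArgumentException("upper < lower");
        }
        this.lower = lower;
        this.upper = upper;
        this.slots = new Object[upper - lower + 1];
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        return (T) slots[offset(index)];
    }

    void set(int index, T value) {
        slots[offset(index)] = value;
    }

    int size() {
        return slots.length;
    }

    // Translates a user-facing index into a zero-based slot, checking the bounds.
    private int offset(int index) {
        if (index < lower || index > upper) {
            throw new IndexOutOfBoundsException(
                    "Index " + index + " outside [" + lower + ", " + upper + "]");
        }
        return index - lower;
    }

    public static void main(String[] args) {
        RangeArray<String> zeroBasedArray = new RangeArray<>(0, 5);
        RangeArray<String> oneBasedArray = new RangeArray<>(1, 6);

        zeroBasedArray.set(0, "first");
        oneBasedArray.set(1, "first");

        // Same size and same internal layout; only the user-facing index differs.
        System.out.println(zeroBasedArray.size() == oneBasedArray.size());        // true
        System.out.println(zeroBasedArray.get(0).equals(oneBasedArray.get(1)));   // true
    }
}
```

The subtraction in offset is the only cost of the non-zero lower bound, and it is the kind of thing a compiler could fold away when the range is part of the type.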


  Peter


On 17/04/10 04:51, B Smith-Mannschott wrote:

On Fri, Apr 16, 2010 at 18:40, Kevin Wright
<[email protected]>  wrote:

You've now gone and spoilt a perfectly good nonsense thread with some
(admittedly obvious) logic and reason!
shame on you...

And here's more logic and reason, though I prefer to index using
integer multiples of PI ;-)

http://userweb.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

Why numbering should start at zero
==================================

To denote the subsequence of natural numbers 2, 3, ..., 12 without the
pernicious three dots, four conventions are open to us

a) 2 ≤ i < 13
b) 1 < i ≤ 12
c) 2 ≤ i ≤ 12
d) 1 < i < 13

Are there reasons to prefer one convention to the other? Yes, there
are. The observation that conventions a) and b) have the advantage
that the difference between the bounds as mentioned equals the length
of the subsequence is valid. So is the observation that, as a
consequence, in either convention two subsequences are adjacent means
that the upper bound of the one equals the lower bound of the other.
Valid as these observations are, they don't enable us to choose
between a) and b); so let us start afresh.

[... follow the link for the rest ...]

http://userweb.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html


