Re: [The Java Posse] Re: Zero-based vs. one-based indexing

Peter Becker Sat, 17 Apr 2010 03:57:09 -0700

Reinier,

you very conveniently pick pieces. Let me show you inline...


On 17/04/10 13:16, Reinier Zwitserloot wrote:

replies inline.

On Apr 17, 2:15 am, Peter Becker <[email protected]> wrote:

Traditionally zero is not a natural number.

Who cares about natural numbers?

You wrote:

RULE 1: Counting elements in a list is in the domain of the natural
numbers. Therefore, if negative numbers are needed the solution is
inferior.

This is what I was replying to, only that you removed that part.

This is about the counting numbers,
and zero is undisputably a counting number.

Funny, the first three hits for "counting number" on Google are for me:

http://mathworld.wolfram.com/CountingNumber.html
http://wiki.answers.com/Q/What_is_a_counting_number
http://www.mathsisfun.com/whole-numbers.html

Wolfram (admittedly the best of those sources) says that "However, zero (0) is sometimes also included in the list of counting numbers."; the other two exclude zero. Not exactly what I call "undisputable". But let's not go there.

After all, if you're
counting and there's nothing there, you need some name for this
concept as well as a notation. The name is 'zero' and the notation is
'0' - at least, that seems the most sensible thing to pick. Given the
axiomatic need to represent the empty range somehow, there's no
dodging the need for the 0 concept.

Who's trying to dodge the zero concept? I just don't think it is common to start counting from zero. Feasible: yes. Common: no.

RULE 2: In a list of, say, 10 elements, it would be odd if '11' is
anything other than an Out-Of-Bounds number.

It isn't in either scheme.

But of course it is; in the indices-start-at-1 system, to make a copy
of a list you'd have to write list.subList(1, 11), and yet this won't
cause an IndexOutOfBoundsException.

No, it is not. The second parameter is the index behind the last element, and thus not necessarily part of the index range.
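A minimal sketch of how this plays out in Java as it stands (0-based start, exclusive end). The class and helper names are made up for illustration; only List.subList itself is the real API. It shows that the end index may equal the list size without an exception, while one step further is rejected:

```java
import java.util.ArrayList;
import java.util.List;

final class SubListBounds {
    // Builds a list holding the values 1..n (illustrative helper).
    static List<Integer> firstN(int n) {
        List<Integer> list = new ArrayList<>();
        for (int value = 1; value <= n; value++) {
            list.add(value);
        }
        return list;
    }

    // Returns true if subList(from, to) throws IndexOutOfBoundsException.
    static boolean throwsOutOfBounds(List<Integer> list, int from, int to) {
        try {
            list.subList(from, to);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> ten = firstN(10);
        // toIndex == size() is legal: it names the position behind the last element.
        System.out.println(ten.subList(0, 10).size());      // 10
        System.out.println(throwsOutOfBounds(ten, 0, 10));  // false
        // One step past that position is out of bounds for subList as well.
        System.out.println(throwsOutOfBounds(ten, 0, 11));  // true
    }
}
```

In a 1-based scheme with an exclusive end, the corresponding calls would be subList(1, 11) and subList(1, 12).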

One could of course switch to a
system where the end is inclusive and not exclusive like Java's. In
this case you'd end up with list.subList(1, 10), but now you get into
all the points Dijkstra raised: The difference between the indices is
offset by 1 from the length of the sublist, and the thing I mentioned:
To represent an empty list you'd now need to write: list.subList(1,
0), which is weird. I'm going to make that axiomatic, by the way: I'm
taking it as natural and given that subList(1, 0) looks awkward. If
this is not axiomatic for you I'm not sure there's a point arguing, as
that would come down to taste. We could however spew facts about the
size of the population that thinks subList(1, 0) is acceptable, of
course.

I agree that this is about taste. I still think that subList(i,i) is rather odd and unexpected unless you grew up in C world. But I don't have any stats on population sizes either.
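To make the two notations concrete, here is a rough sketch. halfOpen is just Java's existing subList; oneBasedClosed is a hypothetical helper (not a JDK method) that translates a 1-based, end-inclusive range into the JDK convention. It shows the two spellings of the empty list under discussion, subList(i, i) versus (1, 0), and the length arithmetic for each:

```java
import java.util.Arrays;
import java.util.List;

final class RangeConventions {
    // Java's own convention: 0-based start, exclusive end.
    // The length of the result is toExclusive - from.
    static <T> List<T> halfOpen(List<T> list, int from, int toExclusive) {
        return list.subList(from, toExclusive);
    }

    // Hypothetical alternative: 1-based start, inclusive end.
    // The length of the result is last - first + 1.
    static <T> List<T> oneBasedClosed(List<T> list, int first, int last) {
        return list.subList(first - 1, last);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "d", "e");

        System.out.println(halfOpen(words, 0, 5));        // [a, b, c, d, e]
        System.out.println(halfOpen(words, 2, 2));        // []  (empty: from == to)

        System.out.println(oneBasedClosed(words, 1, 5));  // [a, b, c, d, e]
        System.out.println(oneBasedClosed(words, 1, 0));  // []  (empty: last == first - 1)
        System.out.println(oneBasedClosed(words, 3, 5));  // [c, d, e]
    }
}
```

Which of the two empty-list spellings looks more awkward is exactly the matter of taste.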

What is the smallest natural number? I studied math in Germany and there
I learned that there is a set N (let's not add too much TeX -- the N
should have the double bar on the left), which is the sequence starting
at one, then repeatedly adding one. The numbers including zero are N_0
(subscript zero). That is a bit old school, and there are many different
conventions (see e.g. http://en.wikipedia.org/wiki/Natural_number). But
saying "there is a smallest natural number" and implying it is zero
hides the fact that conventions have been and continue to be different.

Dijkstra's point WASN'T that the smallest natural number is zero. His
point instead was that the number BELOW whatever one picks as
'smallest natural number' will be required in most of the schemes that
aren't Java's. So, if you think 0 is ugly, then you could start at
offset 1, but you then also have to choose that end indices are
exclusive. Because if you don't, to describe the empty set you need to
write subList(1, 0) - and that 0 is one less than the lowest natural
number, and therefore, 'unnatural'.

What is the problem with an "unnatural" number?

The same argument applies if one
DOES consider 0 as a natural number - if you make the end index
inclusive you'd have to write subList(0, -1), which also includes an
unnatural number. This part of Dijkstra's argument attempts to
prove that the only proper way to designate a range is for the start
index to be inclusive, and the end index to be exclusive.

For which you have to assume that this number outside the range is somehow unwanted. Matter of taste again.

You've also mentioned a few times that you don't consider the benefits
raised so far as relevant, but, let's flip it around then: What does
counting from 1 get you?

A common behaviour with other scenarios. AFAIK not a single spreadsheet application starts row or column count at 0. And if I do the thought experiment of standing at central station asking all passers-by to number a list of something (let's say words), I somehow think that the number of people starting their labels with 1 would be much higher than the number of those starting with 0. You even talk about the "first element of an array", not the "zeroth".


Do you think everyone should change their ways and start counting at zero?

At least counting from 0 gets you the ability
to describe the smallest and largest sublists in a way that is
consistent and doesn't have to dip into unnatural numbers on either
end.

Actually, by having the second parameter of subList(..) exclusive you lose one addressable element. If you have an array that spans the whole addressable range, then you can't clone the array using the subList(..) method. Not that this would be common, but neither is creating empty arrays using subList(..).
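The effect is easiest to see with a small index type. The sketch below uses byte as a stand-in for "the whole addressable range"; that choice, and the class and method names, are assumptions for illustration only. Java's own lists and arrays index with int and their size always fits in an int, so the case may never come up there. With an exclusive end, the bound needed to cover every byte value is 128, which a byte cannot hold:

```java
final class ExclusiveBoundLimit {
    // Counts the values in [from, toExclusive). The loop variable is an int
    // so that the loop itself cannot overflow.
    static int countHalfOpen(byte from, byte toExclusive) {
        int count = 0;
        for (int i = from; i < toExclusive; i++) {
            count++;
        }
        return count;
    }

    // Counts the values in [from, toInclusive].
    static int countClosed(byte from, byte toInclusive) {
        int count = 0;
        for (int i = from; i <= toInclusive; i++) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // An inclusive end reaches all 256 byte values.
        System.out.println(countClosed(Byte.MIN_VALUE, Byte.MAX_VALUE));    // 256
        // The largest exclusive end a byte can express leaves out 127.
        System.out.println(countHalfOpen(Byte.MIN_VALUE, Byte.MAX_VALUE));  // 255
        // The bound that would be needed (128) wraps around to -128.
        System.out.println((byte) (Byte.MAX_VALUE + 1) == Byte.MIN_VALUE);  // true
    }
}
```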

Adhering to convention a) yields, when starting with subscript 1, the
subscript range 1 ≤ i < N+1; starting with 0, however, gives the
nicer range 0 ≤ i < N.

What makes the second range nicer? I am missing an argument supporting
this statement.

The range 0 ≤ i < N includes only numbers that fit within the set's
own counting numbers, whereas 1 ≤ i < N + 1 contains a number that isn't
a counting number of the set (N + 1 itself).

I don't get this at all. Either you start counting at 0 (consistent with your index), in which case N is not a number within the counting numbers. Or you count from 1, in which case 0 isn't. Your index has N values, the range 0..N has N+1.
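The counts can be checked quickly. The following is only a sketch; the class and method names are invented, while IntStream.range (exclusive end) and IntStream.rangeClosed (inclusive end) are the standard JDK methods:

```java
import java.util.stream.IntStream;

final class RangeSizes {
    // Number of integers i with from <= i < toExclusive.
    static long sizeHalfOpen(int from, int toExclusive) {
        return IntStream.range(from, toExclusive).count();
    }

    // Number of integers i with from <= i <= toInclusive.
    static long sizeClosed(int from, int toInclusive) {
        return IntStream.rangeClosed(from, toInclusive).count();
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println(sizeHalfOpen(0, n));      // 10: the indices 0..9
        System.out.println(sizeHalfOpen(1, n + 1));  // 10: the indices 1..10
        System.out.println(sizeClosed(1, n));        // 10: the indices 1..10
        System.out.println(sizeClosed(0, n));        // 11: 0..N has N+1 values
    }
}
```

Both half-open forms describe N indices; the bounds 0 and N (or 1 and N+1) are simply the numbers used to write the range down.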

My personal opinion is that there are valid reasons why it is convenient
for an implementation of array structures to use the convention Dijkstra
proposes as (a). You basically take the perspective of pointers (oddly
enough not an argument Reinier or Dijkstra made).

Of course not. Implementation is utterly irrelevant here. It would be
completely trivial for C to consider a[1] as simply dereferencing 'a'
as a pointer. In Java in particular the same argument can be made. I'm
speaking now from the perspective of the compiler itself. If you'd
like to claim that it's easier for library writers (who are after all
'users' of the programming language) to work with 0 offsets then that
is of course a fine argument for why 0 offsetting is the only right
answer.

I claim it is easier to use 0-based indexing if you use a pointer mindset.
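A rough sketch of what I mean by the pointer mindset, using a row-major grid stored in one flat array. The class and method names are invented for illustration. With 0-based indices the position is a plain "distance from the start"; with 1-based indices the same formula needs corrections going in and coming out:

```java
final class OffsetArithmetic {
    // 0-based row and column, result is a 0-based offset into the flat array.
    static int offsetZeroBased(int row, int col, int width) {
        return row * width + col;
    }

    // 1-based row and column, result is a 1-based position.
    // Each index is shifted to a distance first, then shifted back.
    static int positionOneBased(int row, int col, int width) {
        return (row - 1) * width + (col - 1) + 1;
    }

    public static void main(String[] args) {
        int width = 4;
        // First cell of the grid.
        System.out.println(offsetZeroBased(0, 0, width));   // 0
        System.out.println(positionOneBased(1, 1, width));  // 1
        // Last cell of a grid with 3 rows and 4 columns.
        System.out.println(offsetZeroBased(2, 3, width));   // 11
        System.out.println(positionOneBased(3, 4, width));  // 12
    }
}
```

Outside of this kind of offset arithmetic the extra corrections do not appear, which is why I tie the preference to the mindset rather than to indexing as such.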

I believe pretty much anyone else (including most
mathematicians) tends to prefer 1-based indexing.

As Dijkstra tried to highlight with his Mesa example, in practice
0-based indexing is more consistent.

His example seems limited to the statement "Extensive experience with Mesa has shown that the use of the other three conventions has been a constant source of clumsiness and mistakes" -- without any more information on what exactly Mesa did and how that caused problems it is pretty hard to judge.

It's a lesson you need to grok, but
once you do so it's all consistent. With 1-offsetting you lose some of
this, as for example a location offset has to be 0 based to make any
sense, and thus you always have to think: Is this 1-offset or
0-offset? In other words, 0-offset is unavoidable, but 1-offset is
avoidable. By avoiding 1-offsets, all offsetting is 0 based and even if
this feels less natural for a beginning programmer it's easier for
everyone that's been programming for longer than 7 days. As most
programmers spend more than 14 days programming, 0 offsetting wins. I
freely admit I pulled the number '7 days' out of you know where, but
you get the point, I presume.

How are indexing and offsetting the same? We are still assuming we are not arguing in terms of pointer arithmetic, aren't we?


  Peter
