Re: Array start index

2017-02-08 Thread Bastiaan Veelo via Digitalmars-d-learn
Wrapup: I am going to go for the original approach of index 
conversion, and leaving the offset-pointer approach for what it 
is.


Reasons:
1) uncertain efficiency gain/loss,
2) theoretically it may fail,
3) .sizeof does not include the payload,
4) analysis of the assembler generated by our reference compiler 
(Prospero) shows that it also uses index conversion. 
(Interestingly, it does so at compile time, when possible).


Thanks for many interesting insights.

Bastiaan.


Re: Array start index

2017-02-07 Thread Ali Çehreli via Digitalmars-d-learn

On 02/07/2017 02:04 PM, Bastiaan Veelo wrote:

>> This optimization cannot work if the array is a static array inside
>> the same struct. It would work with a dynamic array but then it would
>> probably be slower than applying the *(ptr+index) technique.
>
> You mean slower than the _payload[index - first] technique? Is that
> because of heap memory versus stack memory?

Mandatory disclaimer: We can't be sure without testing.

Not exactly because stack versus heap because the whole object might be 
sitting in heap memory anyway:


alias S = StaticArray!(/* ... */);

S[] objects;
objects ~= S(/* ... */);

So, all of those are on the heap.

It's more about having everything at hand, near each other, close in 
memory. If the member is a static array, then when we have an S[], all 
members are there to be operated on. If the array in dynamic, then an 
S[] has indirections, reaching out to the heap, which may involve 
long-latency memory reads.


foreach (s; objects) {
s[42];// May have a cache miss and read from memory
}

In the static array case, the entire body of s is already on the cache.

On the other hand, when the objects are large, then few of those can fit 
in the cache. etc. One needs to profile to see what works better.


> Am I correct that the reason that a dynamic array would work is because
> it is allocated independently from the struct, meaning that its address
> stays the same even if the struct is moved?

Yes, that would be a requirement to prevent the self-reference.

> We could .reserve a dynamic
> array in initialize() to prevent it being resized.

Makes sense.

Another solution that came to my mind: You can keep the self-referencing 
solution but make sure that the objects are not moved after initialize():


S[] objects;
objects.reserve(100);
objects.each!(o => o.initialize());
// The objects will not move if you're careful

> Thanks again.
> Bastiaan.

Ali



Re: Array start index

2017-02-07 Thread Bastiaan Veelo via Digitalmars-d-learn

On Tuesday, 7 February 2017 at 20:33:35 UTC, Ali Çehreli wrote:

On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

> We do not need to take measures against the GC?

Of course we have to and the struct that I wrote is illegal 
because it is self-referencing through the _ptr member. (D has 
the right to move structs around, making my _ptr potentially 
pointing to an illegal address.)


Gosh am I glad I brought that up, and for people like you hanging 
out in the learn group. Thanks for pointing this out!


This optimization cannot work if the array is a static array 
inside the same struct. It would work with a dynamic array but 
then it would probably be slower than applying the *(ptr+index) 
technique.


You mean slower than the _payload[index - first] technique? Is 
that because of heap memory versus stack memory?


Am I correct that the reason that a dynamic array would work is 
because it is allocated independently from the struct, meaning 
that its address stays the same even if the struct is moved? We 
could .reserve a dynamic array in initialize() to prevent it 
being resized.


Thanks again.
Bastiaan.


Re: Array start index

2017-02-07 Thread Bastiaan Veelo via Digitalmars-d-learn

On Tuesday, 7 February 2017 at 20:28:30 UTC, Ali Çehreli wrote:

You forgot to call that most important function. ;)


Hah of course. I assumed the name would give it some special 
meaning, like postblit.


1) I don't understand the first assert there, which does not 
pass for me, so I commented it out.


The intention is to check that (_payload.ptr - first) is larger 
than 0.


2) May bad: init() is not a good name for a struct member, so 
it should be renamed:


void initialize() {
// assert( first < cast(size_t)_payload.ptr);
   // Address space underrun.
assert(-first < size_t.max - cast(size_t)_payload.ptr); 
// Address space overrun.

this._ptr = _payload.ptr - first;
}

3) Instead of having to remember to call it, let's introduce a 
function that does it for us:


auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() {
auto s = StaticArray!(T, first, last)();
s.initialize();
return s;
}


OK good.


unittest {
// StaticArray!(int, -10, 10) arr;
auto arr = makeStaticArray!(int, -10, 10);

> foreach (i, ref e; arr)
> e = i;

Unrelated: That line passes because you're building 32-bits. 
Here is the error I got:


  Error: cannot implicitly convert expression (i) of type long 
to int


You can cast it:

e = cast(int)i;

or by

import std.conv : to;
e = i.to!int;


Thanks a lot for your illustrative answers, including the next 
one!


Bastiaan.


Re: Array start index

2017-02-07 Thread Ali Çehreli via Digitalmars-d-learn

Pressed send too soon, before considering your GC question.

On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

> We do not need to take measures against the GC?

Of course we have to and the struct that I wrote is illegal because it 
is self-referencing through the _ptr member. (D has the right to move 
structs around, making my _ptr potentially pointing to an illegal address.)


This optimization cannot work if the array is a static array inside the 
same struct. It would work with a dynamic array but then it would 
probably be slower than applying the *(ptr+index) technique.


Ali



Re: Array start index

2017-02-07 Thread Ali Çehreli via Digitalmars-d-learn

On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

> void init() {
> assert( first < cast(size_t)_payload.ptr);  //
> Address space underrun.
> assert(-first < size_t.max - cast(size_t)_payload.ptr); //
> Address space overrun.
> this._ptr = _payload.ptr - first;
> }

You forgot to call that most important function. ;)

1) I don't understand the first assert there, which does not pass for 
me, so I commented it out.


2) May bad: init() is not a good name for a struct member, so it should 
be renamed:


void initialize() {
// assert( first < cast(size_t)_payload.ptr);  // 
Address space underrun.
assert(-first < size_t.max - cast(size_t)_payload.ptr); // 
Address space overrun.

this._ptr = _payload.ptr - first;
}

3) Instead of having to remember to call it, let's introduce a function 
that does it for us:


auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() {
auto s = StaticArray!(T, first, last)();
s.initialize();
return s;
}

unittest {
// StaticArray!(int, -10, 10) arr;
auto arr = makeStaticArray!(int, -10, 10);

> foreach (i, ref e; arr)
> e = i;

Unrelated: That line passes because you're building 32-bits. Here is the 
error I got:


  Error: cannot implicitly convert expression (i) of type long to int

You can cast it:

e = cast(int)i;

or by

import std.conv : to;
e = i.to!int;

Ali



Re: Array start index

2017-02-07 Thread Bastiaan Veelo via Digitalmars-d-learn

On Monday, 6 February 2017 at 23:42:55 UTC, Ali Çehreli wrote:

Then you use _ptr when indexing:

// Support e = arr[5];
ref T opIndex(ptrdiff_t index) {
assert(index >= first);
assert(index <= last);
return *(_ptr + index);
}

Ali


Thank you very much for your time. Sadly this gives me an access 
violation. The traceback doesn't seem right though, as it seems 
to jump to opIndexAssign from where opIndex is used. If I'm not 
mistaken, I have confirmed that _ptr is a valid address, see 
below. We do not need to take measures against the GC?


- Bastiaan.


On Windows:

rdmd -main -unittest -debug -g source\epcompat\array.d

object.Error@(0): Access Violation

0x00402057 in void epcompat.array.__unittestL81_1() at 
C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(88)
0x0040465C in void epcompat.array.__modtest() at 
C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(39)
0x0040A455 in int 
core.runtime.runModuleUnitTests().__foreachbody1(object.ModuleInfo*)
0x0040C007 in int object.ModuleInfo.opApply(scope int 
delegate(object.ModuleInfo*)).__lambda2(immutable(object.ModuleInfo*))

0x00406808 in _d_run_main
0x004046D4 in main at 
C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(7)

0x004225DD in mainCRTStartup
0x779C62C4 in BaseThreadInitThunk
0x77B50FD9 in RtlSubscribeWnfStateChangeNotification
0x77B50FA4 in RtlSubscribeWnfStateChangeNotification

The complete source:

module epcompat.array;

// Test with rdmd -main -unittest -debug -g 
source\epcompat\array.d


alias StaticArray_offset StaticArray;

/*
 * A fixed-length array with an index that runs from $(D_PARAM 
first)

 * to $(D_PARAM last) inclusive.
 *
 * Implemented by means of an offset pointer.
 */
struct StaticArray_offset(T, ptrdiff_t first, ptrdiff_t last) {
T[last - first + 1] _payload;
T* _ptr;

void init() {
assert( first < cast(size_t)_payload.ptr);  
// Address space underrun.
assert(-first < size_t.max - cast(size_t)_payload.ptr); 
// Address space overrun.

this._ptr = _payload.ptr - first;
}

size_t length() {
return _payload.length;
}

// Support e = arr[5];
ref T opIndex(ptrdiff_t index) {
assert(index >= first);
assert(index <= last);
return *(_ptr + index);
}

// Support arr[5] = e;
void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) {
assert(index >= first);
assert(index <= last);
*(_ptr + index) = value;
}   // Line 39

// Support foreach(e; arr).
int opApply(scope int delegate(ref T) dg)
{
int result = 0;

for (int i = 0; i < _payload.length; i++)
{
result = dg(_payload[i]);
if (result)
break;
}
return result;
}

// Support foreach(i, e; arr).
int opApply(scope int delegate(ptrdiff_t index, ref T) dg)
{
int result = 0;

for (int i = first; i <= last; i++)
{
result = dg(i, *(_ptr + i));
if (result)
break;
}
return result;
}

// Write to binary file.
void toFile(string fileName)
{
import std.stdio;
auto f = File(fileName, "wb");
if (f.tryLock)
{
f.rawWrite(_payload);
}
}
}

unittest {
StaticArray!(int, -10, 10) arr;
assert(arr.length == 21);

foreach (ref e; arr)
e = 42;

assert(arr[-10] == 42); // Line 88
assert(arr[0]   == 42);
assert(arr[10]  == 42);

foreach (i, ref e; arr)
e = i;

assert(arr[-10] == -10);
assert(arr[0]   ==   0);
assert(arr[5]   ==   5);
assert(arr[10]  ==  10);

arr[5] = 15;
assert(arr[5]   == 15);
}



Re: Array start index

2017-02-06 Thread WhatMeForget via Digitalmars-d-learn

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?
Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

I think the (very old) IBM PL/I language was like this.


There is a good metaphor in multi-story buildings. Americans 
number the first or ground floor of a high rise as floor #1.  In 
the UK, their first floor is what Americans would call the 2nd 
floor.


But then how can you trust a group of people who drive on the 
wrong side of the road :)







Re: Array start index

2017-02-06 Thread bachmeier via Digitalmars-d-learn

On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:
One reason for zero-based indexes that isn't "it's what we're 
all used to" is that if you used one-based indexes, you would 
be able to represent one fewer index than zero-based, since one 
of the representable values - zero - could no longer be used to 
represent any index.


Also, it's what we're all used to, and it makes perfect sense 
to a lot of us, and the only times in recent memory I've ever 
made off-by-one errors were when I was trying to use Lua and 
its one-based indexing.


In the scientific programming community, languages like Matlab, 
Fortran, and R use one-based indexing for a very good reason. 
Zero-based indexing is a computer science thing that makes no 
sense to someone working with equations.


Re: Array start index

2017-02-06 Thread Ali Çehreli via Digitalmars-d-learn

On 02/06/2017 03:00 PM, Bastiaan Veelo wrote:

> In "Numerical Recipes in C", section 1.2, Press et al. propose an easy
> solution using an offset pointer:
>
> float b[4], *bb;
> bb = b - 1;
>
> Thus bb[1] through bb[4] all exist, no space is wasted nor is there a
> run-time overhead.
>
> I have tried to adapt the internals of my struct to Press' approach, but
> it seems to me it is not that easy in D -- or I'm just not proficient
> enough. Can someone here show me how that could be done?

struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) {
T[last - first + 1] _payload;

If the static array has to be a member like that (as opposed to a 
dynamic one), you need an extra member:


T* _ptr;

Because structs cannot have default constructors, you need an init() 
function to establish the invariant:


void init() {
this._ptr = _payload.ptr - first;
}

Of course, that relies on the fact that _payload.ptr - first is a valid 
address to hold in a pointer.


[OT]:
// Commented-out as it seems dangerous because _payload does not
// have the same semantics as StaticArray.
//
//  alias _payload this;

Which means, you would implement length() yourself:

size_t length() {
return _payload.length;
}

Then you use _ptr when indexing:

// Support e = arr[5];
ref T opIndex(ptrdiff_t index) {
assert(index >= first);
assert(index <= last);
return *(_ptr + index);
}

Ali



Re: Array start index

2017-02-06 Thread Bastiaan Veelo via Digitalmars-d-learn

(There is an honest question in the end, please read on.)

All good reasons set aside, both in favour and against 0-based 
arrays, the only reason that is relevant to me right now is that 
we are seriously looking into translating close to a million 
lines of foreign code to D, from a language that supports arrays 
over arbitrary ranges of indices (positive and negative). 
Adapting this much program logic to a different array base is out 
of the question; this is an engineering application and lives are 
at stake. So let's not bring the base-arguments back into this 
sub-thread but focus on a performant solution.


Expanding on Marc's outset, I now have:

/*
 * A fixed-length array with an index that runs from $(D_PARAM 
first)

 * to $(D_PARAM last) inclusive.
 *
 * Indices are converted, which involves a small overhead.
 */
struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) {
T[last - first + 1] _payload;

alias _payload this;

// Support e = arr[5];
ref T opIndex(ptrdiff_t index) {
assert(index >= first);
assert(index <= last);
return _payload[index - first];
}

// Support arr[5] = e;
void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) {
assert(index >= first);
assert(index <= last);
_payload[index - first] = value;
}

// Support foreach(e; arr).
int opApply(scope int delegate(ref T) dg)
{
int result = 0;

for (int i = 0; i < _payload.length; i++)
{
result = dg(_payload[i]);
if (result)
break;
}
return result;
}

// Support foreach(i, e; arr).
int opApply(scope int delegate(ptrdiff_t index, ref T) dg)
{
int result = 0;

for (int i = 0; i < _payload.length; i++)
{
result = dg(i + first, _payload[i]);
if (result)
break;
}
return result;
}

// Write to binary file.
void toFile(string fileName)
{
import std.stdio;
auto f = File(fileName, "wb");
if (f.tryLock)
{
f.rawWrite(_payload);
}
}
}

unittest {
StaticArray!(int, -10, 10) arr;
assert(arr.length == 21);

foreach (ref e; arr)
e = 42;
assert(arr[-10] == 42);
assert(arr[0]   == 42);
assert(arr[10]  == 42);

foreach (i, ref e; arr)
e = i;
assert(arr[-10] == -10);
assert(arr[0]   ==   0);
assert(arr[5]   ==   5);
assert(arr[10]  ==  10);

arr[5] = 15;
assert(arr[5]   == 15);
}

//

(The first and last indices probably don't need to be template 
arguments, they could possibly be immutable members of the 
struct; But that is not what worries me now.)
The thing is that a small overhead is paid in opIndex() and 
opIndexAssign() (the other member functions are fine). I'd like 
to get rid of that overhead.


In "Numerical Recipes in C", section 1.2, Press et al. propose an 
easy solution using an offset pointer:


float b[4], *bb;
bb = b - 1;

Thus bb[1] through bb[4] all exist, no space is wasted nor is 
there a run-time overhead.


I have tried to adapt the internals of my struct to Press' 
approach, but it seems to me it is not that easy in D -- or I'm 
just not proficient enough. Can someone here show me how that 
could be done?


Thanks!


Re: Array start index

2017-02-06 Thread Nemanja Boric via Digitalmars-d-learn

On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:
One reason for zero-based indexes that isn't "it's what we're 
all used to" is that if you used one-based indexes, you would 
be able to represent one fewer index than zero-based, since one 
of the representable values - zero - could no longer be used to 
represent any index.


Also, it's what we're all used to, and it makes perfect sense 
to a lot of us, and the only times in recent memory I've ever 
made off-by-one errors were when I was trying to use Lua and 
its one-based indexing.


Related:

https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html


Re: Array start index

2017-02-06 Thread pineapple via Digitalmars-d-learn
One reason for zero-based indexes that isn't "it's what we're all 
used to" is that if you used one-based indexes, you would be able 
to represent one fewer index than zero-based, since one of the 
representable values - zero - could no longer be used to 
represent any index.


Also, it's what we're all used to, and it makes perfect sense to 
a lot of us, and the only times in recent memory I've ever made 
off-by-one errors were when I was trying to use Lua and its 
one-based indexing.


Re: Array start index

2017-02-06 Thread Bastiaan Veelo via Digitalmars-d-learn

On Monday, 6 February 2017 at 14:26:35 UTC, Bastiaan Veelo wrote:

The unit test didn't detect this because it was ambiguous.


Sorry for that misinformation. I should have said that 
opIndexAssign wasn't tested. Here is a better test.


unittest {
OneBasedArray!int arr;
arr = [1,2,3];
arr ~= 4;
arr[4] = 14;
assert(arr.length == 4);
assert(arr[1] == 1);
assert(arr[2] == 2);
assert(arr[3] == 3);
assert(arr[4] == 14);
}



Re: Array start index

2017-02-06 Thread Bastiaan Veelo via Digitalmars-d-learn

On Tuesday, 4 August 2015 at 08:18:50 UTC, Marc Schütz wrote:
void opIndexAssign(U : T)(size_t index, auto ref U 
value) {


Careful here, you've got the arguments reversed. The unit test 
didn't detect this because it was ambiguous. This one isn't:


unittest {
OneBasedArray!int arr;
arr = [1,2,3];
arr ~= 14;
assert(arr.length == 4);
assert(arr[1] == 1);
assert(arr[2] == 2);
assert(arr[3] == 3);
assert(arr[4] == 14);
}



Re: Array start index

2015-08-04 Thread QAston via Digitalmars-d-learn

On Monday, 3 August 2015 at 21:32:05 UTC, DLearner wrote:
Looks like 0-base is fixed, to avoid problems with existing 
code.


But nothing stops _adding_ to the language by allowing
int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., 
foo[y].
Plus rule that int[:y] means valid symbols are foo[1], 
foo[2],..., foo[y].


That way, 1-start achieved, with no conflict with existing code?


There're quite a few things stopping this from being added to the 
language.


1. People will have to learn this new feature and it's 
interaction with gazillion of other D features.


2. There would be a redundancy - core language will have 2 array 
types while one of them can be easily implemented using the other.


3. Devs will have to maintain it - as if they don't have enough 
things to fix atm.


Really, this is so simple to do as a library - just use opIndex, 
opSlice with a template struct.


As a general rule - start asking for language features only when 
things can't be done without them.


Re: Array start index

2015-08-04 Thread via Digitalmars-d-learn

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?
Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.


I, too, don't think this is a good idea in general, but I can see 
a few use-cases where 1-based indices may be more natural. It's 
easy to define a wrapper:


struct OneBasedArray(T) {
T[] _payload;

alias _payload this;

T opIndex(size_t index) {
assert(index  0);
return _payload[index-1];
}

void opIndexAssign(U : T)(size_t index, auto ref U value) 
{

assert(index  0);
_payload[index-1] = value;
}
}

unittest {
OneBasedArray!int arr;
arr = [1,2,3];
arr ~= 4;
assert(arr.length == 4);
assert(arr[1] == 1);
assert(arr[2] == 2);
assert(arr[3] == 3);
assert(arr[4] == 4);
}

Test with:

rdmd -main -unittest xx.d

This can of course be easily extended to support other bases than 
one.


Re: Array start index

2015-08-03 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, August 03, 2015 21:32:03 DLearner via Digitalmars-d-learn wrote:
 On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:
  On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:
 
  Adding 1-indexed arrays to the language fixes nothing. Just
  write your 1-indexed array type and if you enjoy using it,
  publish it as a library. Who knows, if demand is high it may
  even end up in phobos.
 
  Oh, I don't think that's a good idea. It's too confusing to
  have more than one method of indexing within the same language.
  You just have to do a thorough job of testing, as the
  possibility of errors is something you'll have to live with,
  given the different design choices of different languages.

 Looks like 0-base is fixed, to avoid problems with existing code.

 But nothing stops _adding_ to the language by allowing
 int[x:y] foo to mean valid symbols are foo[x], foo[x+1],...,
 foo[y].
 Plus rule that int[:y] means valid symbols are foo[1],
 foo[2],..., foo[y].

 That way, 1-start achieved, with no conflict with existing code?

Almost all programming languages in heavy use at this point in time start
indexing at 0. It would be highly confusing to almost all programmers out
there to have 1-based indexing. In addition, having 0-based indexing
actually makes checking against the end of arrays and other random-access
ranges easier. You can just check against length without having to do any
math. In general, I would expect 1-based indexing to _increase_ the number
of off by one errors in code - both because 0-based indexing helps avoid
such problems when dealing with the end of the array and more importantly,
because almost everyone expects 0-based indexing.

You're really barking up the wrong tree if you're trying to get any support
for 1-based indexing in D. I doubt that you will see much of anyone who
thinks that it's even vaguely a good idea, and there's no way that Walter or
Andrei (or probably anyone in the main dev team) who is going to agree that
it's even worth considering.

I think that the reality of the matter is that if you're going to do much
programming - especially if you're going to be professional programmer - you
just need to get used to the idea that array indices start at 0. There are a
few languages out there where they don't, but they are far from the norm.

- Jonathan M Davis



Re: Array start index

2015-08-03 Thread DLearner via Digitalmars-d-learn

On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:

On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:

Adding 1-indexed arrays to the language fixes nothing. Just 
write your 1-indexed array type and if you enjoy using it, 
publish it as a library. Who knows, if demand is high it may 
even end up in phobos.


Oh, I don't think that's a good idea. It's too confusing to 
have more than one method of indexing within the same language. 
You just have to do a thorough job of testing, as the 
possibility of errors is something you'll have to live with, 
given the different design choices of different languages.


Looks like 0-base is fixed, to avoid problems with existing code.

But nothing stops _adding_ to the language by allowing
int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., 
foo[y].
Plus rule that int[:y] means valid symbols are foo[1], 
foo[2],..., foo[y].


That way, 1-start achieved, with no conflict with existing code?



Re: Array start index

2015-08-03 Thread jmh530 via Digitalmars-d-learn

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?
Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y = x) creating integer variables foo[x], foo[x+1],...,foo[y].

I think the (very old) IBM PL/I language was like this.


I come from matlab and R which start matrices at 1. I used to 
think it was more natural. However after I started using numpy I 
now think 0 index is better.
Also see 
http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

Give it a try. You might find you like it.


Re: Array start index

2015-08-03 Thread bachmeier via Digitalmars-d-learn

On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:

Adding 1-indexed arrays to the language fixes nothing. Just 
write your 1-indexed array type and if you enjoy using it, 
publish it as a library. Who knows, if demand is high it may 
even end up in phobos.


Oh, I don't think that's a good idea. It's too confusing to have 
more than one method of indexing within the same language. You 
just have to do a thorough job of testing, as the possibility of 
errors is something you'll have to live with, given the different 
design choices of different languages.


Re: Array start index

2015-08-03 Thread Observer via Digitalmars-d-learn

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?
Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y = x) creating integer variables foo[x], foo[x+1],...,foo[y].


This experiment has already been run.  Perl used to support a $[ 
variable to set the array base.  After experience with the 
confusion and problems that causes, it was finally deprecated and 
effectively removed from the language.  See the end paragraphs of 
http://perldoc.perl.org/perlvar.html and also
http://search.cpan.org/~wolfsage/perl/ext/arybase/arybase.pm for 
more info.


Re: Array start index

2015-08-02 Thread QAston via Digitalmars-d-learn

On Saturday, 1 August 2015 at 23:02:51 UTC, bachmeier wrote:
But what type of programming are you doing? Even after decades 
of programming and trying out dozens of languages, zero-based 
indexing still gets me at times when the arrays I work with 
represent vectors and matrices. Especially when porting code 
from other languages that use one-based indexing. One of the 
nice things about D is that it gives you the tools to easily 
make the change if you want.


Adding 1-indexed arrays to the language fixes nothing. Just write 
your 1-indexed array type and if you enjoy using it, publish it 
as a library. Who knows, if demand is high it may even end up in 
phobos.


Re: Array start index

2015-08-01 Thread John Colvin via Digitalmars-d-learn

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?


For the builtin slice types? Yes, set in stone.

Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y = x) creating integer variables foo[x], foo[x+1],...,foo[y].

I think the (very old) IBM PL/I language was like this.


See 
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html


As other commenters have/will point out, you can easily define a 
custom type in D that behaves as you describe, and - Dijkstra 
notwithstanding - there are valid uses for such things.


Re: Array start index

2015-08-01 Thread bachmeier via Digitalmars-d-learn

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?
Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y = x) creating integer variables foo[x], foo[x+1],...,foo[y].

I think the (very old) IBM PL/I language was like this.


Seems you could easily wrap an array in a struct, define opIndex 
and opSlice appropriately, and use alias this to keep the other 
properties of the array. The problem you'll run into is 
interacting with other code that assumes zero-based indexing. I 
thought about setting the first index at 1 for my library that 
embeds D inside R, since that's how it's done in R, but very 
quickly realized how confusing it would be.


Re: Array start index

2015-08-01 Thread DLearner via Digitalmars-d-learn

On Saturday, 1 August 2015 at 17:55:06 UTC, John Colvin wrote:

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?


For the builtin slice types? Yes, set in stone.

Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does

seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y = x) creating integer variables foo[x], 
foo[x+1],...,foo[y].


I think the (very old) IBM PL/I language was like this.


See 
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html


As other commenters have/will point out, you can easily define 
a custom type in D that behaves as you describe, and - Dijkstra 
notwithstanding - there are valid uses for such things.


D is a C derivative, so it seems a shame not to identify causes 
of bugs in C,

and design them out in D.

For example, in C difficult to write non-trivial commercial 
programs without using pointers.

Pointer manipulation has a terrible reputation for bugs.
But in D, easy to write commercial programs without using 
pointers.

Problem has been designed away.

Similarly, off-by-one array bugs are commonplace in C.
We should seek to eliminate the source of those bugs, which 
basically reduces to the

issue that programmers find it unnatural to start a count at zero.
Whether they _should_ find a zero start unnatural is irrelevant - 
they just do as an
observed fact, so let's change the language so the issue is 
avoided (designed away).


Suggestion: if the codebase for D is considered so large that 
zero-basing cannot now be changed,
why not extend the language to allow for array definitions like 
'int[x:y] foo'?
And then have a rule that 'int[:y] bar' defines a 1-based array 
of y elements?


Re: Array start index

2015-08-01 Thread Andrej Mitrovic via Digitalmars-d-learn
On 8/1/15, DLearner via Digitalmars-d-learn
digitalmars-d-learn@puremagic.com wrote:
 D is a C derivative, so it seems a shame not to identify causes
 of bugs in C,
 and design them out in D.

This has already been done! D defines an array to be a struct with a
pointer and a length. See this article:
http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625

I would argue it's not off-by-one that's causing most issues when
dealing with C arrays, but instead it's in general out-of-bounds
issues (whether it's off bye one or off by 50..) since you often don't
have the length or could easily use the wrong variable as the length.

Think about how much D code would actually have subtle off-by-one
errors if D didn't use 0-based indexing like the majority of popular
languages use. Any time you would interface with other languages you
would have to double, triple-check all your uses of arrays.

FWIW at the very beginning I also found it odd that languages use
0-based indexing, but that was before I had any significant
programming experience under my belt. By now it's second nature to me
to use 0-based indexing.


Re: Array start index

2015-08-01 Thread bachmeier via Digitalmars-d-learn

On Saturday, 1 August 2015 at 19:04:10 UTC, Andrej Mitrovic wrote:
On 8/1/15, DLearner via Digitalmars-d-learn 
digitalmars-d-learn@puremagic.com wrote:

D is a C derivative, so it seems a shame not to identify causes
of bugs in C,
and design them out in D.


This has already been done! D defines an array to be a struct 
with a pointer and a length. See this article: 
http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625


I would argue it's not off-by-one that's causing most issues 
when dealing with C arrays, but instead it's in general 
out-of-bounds issues (whether it's off bye one or off by 50..) 
since you often don't have the length or could easily use the 
wrong variable as the length.


Think about how much D code would actually have subtle 
off-by-one errors if D didn't use 0-based indexing like the 
majority of popular languages use. Any time you would interface 
with other languages you would have to double, triple-check all 
your uses of arrays.


FWIW at the very beginning I also found it odd that languages 
use 0-based indexing, but that was before I had any significant 
programming experience under my belt. By now it's second nature 
to me to use 0-based indexing.


But what type of programming are you doing? Even after decades of 
programming and trying out dozens of languages, zero-based 
indexing still gets me at times when the arrays I work with 
represent vectors and matrices. Especially when porting code from 
other languages that use one-based indexing. One of the nice 
things about D is that it gives you the tools to easily make the 
change if you want.


Re: Array start index

2015-08-01 Thread Rikki Cattermole via Digitalmars-d-learn

On 1/08/2015 9:35 p.m., DLearner wrote:

Does the D language set in stone that the first element of an array
_has_ to be index zero?
Wouldn't starting array elements at one avoid the common 'off-by-one'
logic error, it does
seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y = x) creating integer variables foo[x], foo[x+1],...,foo[y].

I think the (very old) IBM PL/I language was like this.


In c style languages (like D) the index actually defines the offset in 
memory. Not actually the index.
So while in some languages 1 is used instead of 0, 0 maps better to the 
hardware.


Think of this byte array:

ubyte* myptr = [

|---|---|
| i | value |
|---|---|
| 0 | 1 |
| 1 | 255   |

];

For 0 start of index:
size_t i = 0;
assert(myptr[i] == 1);

For 1 start of index:
size_t i = 1;
assert(i != 0);
assert(myptr[i-1] == 1);

While this is not the complete reason why 0 is chosen, it is something 
to think about.