[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2019-12-04 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=4475

RazvanN  changed:

   What      |Removed |Added
   ----------------------------------------------
   Status    |NEW     |RESOLVED
   CC        |        |razvan.nitu1...@gmail.com
   Resolution|---     |WONTFIX

--- Comment #13 from RazvanN  ---
I'm closing this as WONTFIX, since there seems to have been no interest in this
enhancement request.

--


[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2013-08-15 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475


hst...@quickfur.ath.cx changed:

   What|Removed |Added
   -------------------------------------
   CC  |        |hst...@quickfur.ath.cx


--- Comment #9 from hst...@quickfur.ath.cx 2013-08-15 10:06:29 PDT ---
(In reply to comment #8)
> [...]
> >  aa[a] = new C();
> >  auto c = a in aa;
> >  aa[b] = new C();
> >  // Using c here is undefined as an element was added to aa
>
> This can't happen if 'in' returns a bool.

Actually, that is not undefined. AAs are designed such that inserting new
elements does not invalidate pointers to existing elements. In D, because we
have a GC, even if you *delete* elements from an AA, pointers returned by 'in'
continue to be valid. This holds even in the event of a rehash, because the
pointer points to data in a Slot, and add/remove/rehash only shuffle pointers
in the Slot; they don't move the Slot around in memory.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
--- You are receiving this mail because: ---


[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2013-08-15 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #10 from bearophile_h...@eml.cc 2013-08-15 10:49:03 PDT ---
(In reply to comment #9)

> Actually, that is not undefined. AA's are designed such that inserting new
> elements does not invalidate pointers to existing elements.

I didn't know this. Is this stated somewhere in the D specs?


> This holds even in the event of a rehash,

Associative arrays have to grow when you keep adding key-value pairs. I presume
this is done by allocating a new, larger hash table (probably 2 or 4 times
larger) and copying the data into it. In such a situation, don't the pointers
to the items become invalid? Even if the doubling is done with a realloc, it
may sometimes be unable to reallocate in place.

To test my theory I have written a small test program:


void main() {
    enum size_t N = 1_000_000;
    bool[immutable uint] aa;
    auto pointers = new void*[N];

    foreach (immutable uint i; 0 .. N) {
        aa[i] = true;
        pointers[i] = i in aa;
    }

    foreach (immutable uint i; 0 .. N)
        assert(pointers[i] == (i in aa));
}


It gives no errors, so there is something I don't understand. But do the D
specs guarantee that this program will work in all D implementations?



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2013-08-15 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #11 from hst...@quickfur.ath.cx 2013-08-15 12:03:49 PDT ---
(In reply to comment #10)
> [...]
> Associative arrays have to grow when you keep adding key-value pairs, I presume
> this is done allocating a new larger hash (probably 2 or 4 times larger), and
> copying data in it. In such situation aren't the pointers to the items becoming
> invalid? Even if the doubling is done with a realloc, it can sometimes not be
> able to reallocate in place.

It works because the hash table itself doesn't contain the actual key/value
pairs; it just contains pointers to linked lists of these key/value pairs. So
the hash table can be modified however you like, but the key/value pairs stay
at the same memory address.

This would work even if we used something other than linked lists for the
key/value pairs, e.g., trees, because the key/value pairs would just have some
pointers to neighbouring nodes, and during AA rehash (or add/delete) all that
happens is that some of these pointers get reassigned, but the node itself
(containing the key/value pair) remains at the same memory address. This kind
of implementation avoids copying/moving of keys and values, so I'd expect any
good AA implementation to do something similar.

I'm pretty sure that it's generally expected that AA implementations should
obey the principle that iterators (i.e. pointers to key/value) are not
invalidated by add/delete, otherwise it would greatly reduce the usefulness of
AA's. I'm not too sure about this also holding for rehash, but the current AA
implementation does indeed preserve references across rehash as well (though it
does break iteration order if you trigger a rehash in the middle of iterating
over the AA -- but you won't get invalid pointers out of it).
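
The chaining scheme described in this comment can be illustrated with a toy
table. This is only a minimal sketch under stated assumptions: the names
ToyAA, Node, lookup, insert and rehash are hypothetical, and this is not
druntime's actual implementation. It only shows why relinking nodes during a
rehash leaves pointers to values untouched.

```d
import std.stdio : writeln;

// Toy separate-chaining hash table (illustrative only, not druntime's AA).
struct Node
{
    uint key;
    int value;
    Node* next;
}

struct ToyAA
{
    Node*[] buckets; // the "hash table": only pointers to nodes live here
    size_t length;

    // Return a pointer to the stored value, or null if the key is absent.
    int* lookup(uint key)
    {
        if (buckets.length == 0)
            return null;
        for (Node* n = buckets[key % buckets.length]; n !is null; n = n.next)
            if (n.key == key)
                return &n.value;
        return null;
    }

    void insert(uint key, int value)
    {
        if (auto p = lookup(key))
        {
            *p = value;
            return;
        }
        if (length >= buckets.length)
            rehash(buckets.length ? buckets.length * 2 : 4);
        auto idx = key % buckets.length;
        buckets[idx] = new Node(key, value, buckets[idx]);
        ++length;
    }

    // Allocate a new bucket array and relink the existing nodes into it.
    // Nodes are never copied or moved; only their `next` pointers change.
    void rehash(size_t newSize)
    {
        auto fresh = new Node*[newSize];
        foreach (head; buckets)
        {
            for (Node* n = head; n !is null;)
            {
                Node* following = n.next;
                auto idx = n.key % newSize;
                n.next = fresh[idx];
                fresh[idx] = n;
                n = following;
            }
        }
        buckets = fresh;
    }
}

void main()
{
    ToyAA aa;
    aa.insert(1, 100);
    int* p = aa.lookup(1);

    // Force several rehashes.
    foreach (uint i; 2 .. 1000)
        aa.insert(i, cast(int) i);

    assert(p is aa.lookup(1)); // same address as before the rehashes
    assert(*p == 100);
    writeln("pointer still valid after ", aa.length, " inserts");
}
```

Whether the built-in AA is required to behave this way is a separate question;
the sketch only demonstrates the mechanism.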



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2013-08-15 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #12 from bearophile_h...@eml.cc 2013-08-15 12:52:19 PDT ---
(In reply to comment #11)

> the hash table itself doesn't contain the
> actual key/value pairs; it just contains pointers to linked-lists of these
> key/value pairs. So the hash table can be modified however you like, but the
> key/value pairs stays in the same memory address.

I see. But that's just an implementation detail (you could design an AA that
keeps small key-value pairs in an array, plus a pointer to a chain for the
collisions; this is how I have created associative arrays in C). The D specs
can't mandate that implementation, so D code that relies on that implementation
detail goes into the realm of undefined behaviour.
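
For contrast, here is a minimal sketch of a table that keeps its pairs inline
in an array. It is a simplification of the design mentioned above: it uses
plain linear probing rather than an overflow chain, and the names InlineAA,
Slot, lookup, insert, place and grow are all hypothetical. It shows how a
pointer obtained before the table grows ends up referring to a stale copy.

```d
// Toy open-addressing table that stores key/value pairs inline
// (illustrative only; linear probing stands in for the chained overflow).
struct Slot
{
    uint key;
    int value;
    bool used;
}

struct InlineAA
{
    Slot[] slots;
    size_t length;

    // Return a pointer to the stored value, or null if the key is absent.
    int* lookup(uint key)
    {
        if (slots.length == 0)
            return null;
        size_t i = key % slots.length;
        // The table is never full, so an unused slot always ends the search.
        while (slots[i].used)
        {
            if (slots[i].key == key)
                return &slots[i].value;
            i = (i + 1) % slots.length;
        }
        return null;
    }

    void insert(uint key, int value)
    {
        if (auto p = lookup(key))
        {
            *p = value;
            return;
        }
        // Keep the load factor at or below 1/2.
        if ((length + 1) * 2 > slots.length)
            grow();
        place(slots, key, value);
        ++length;
    }

    private static void place(Slot[] table, uint key, int value)
    {
        size_t i = key % table.length;
        while (table[i].used)
            i = (i + 1) % table.length;
        table[i] = Slot(key, value, true);
    }

    // Growing copies every pair into a new array, so values change address.
    private void grow()
    {
        auto bigger = new Slot[slots.length ? slots.length * 2 : 8];
        foreach (ref s; slots)
            if (s.used)
                place(bigger, s.key, s.value);
        slots = bigger;
    }
}

void main()
{
    InlineAA aa;
    aa.insert(1, 100);
    int* p = aa.lookup(1);

    foreach (uint i; 2 .. 100)
        aa.insert(i, cast(int) i);

    // The GC keeps the old array alive, so p is still safe to read,
    // but it no longer aliases the value stored in the table.
    assert(p !is aa.lookup(1));
    *aa.lookup(1) = 200;
    assert(*p == 100);
}
```

With a garbage collector the stale pointer is not a memory-safety problem, but
writes through it would silently miss the live table.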



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-08 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #8 from bearophile_h...@eml.cc 2012-01-08 06:47:39 PST ---
(In reply to comment #7)

> I don't see why pointers are so bad. While, yes, D is a high-level language, it
> is not C# or Java.

Pointers are not evil, but they are usually more bug-prone. An example from
simendsjo:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=31482

> aa[a] = new C();
> auto c = a in aa;
> aa[b] = new C();
> // Using c here is undefined as an element was added to aa

This can't happen if 'in' returns a bool.



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-07 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475


Stewart Gordon s...@iname.com changed:

   What|Removed |Added
   -------------------------------------
   CC  |        |s...@iname.com


--- Comment #2 from Stewart Gordon s...@iname.com 2012-01-07 06:21:53 PST ---
From a semantic point of view, 'in' needs to continue to return a pointer in
regular D, or a boolean in SafeD.

But if it's well optimised, then in most use cases the generated code would end
up the same in both cases.



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-07 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #3 from bearophile_h...@eml.cc 2012-01-07 06:28:08 PST ---
(In reply to comment #2)
> From a semantic point of view, in needs to continue to return a pointer in
> regular D, or a boolean in SafeD.
>
> But if it's well optimised, then in most use cases the generated code would end
> up the same in both cases.

I think having 'in' return a pointer is a case of premature optimization. LDC
shows that in most real situations a compiler is able to optimize two nearby
calls to the associative array lookup function into a single call. So I think a
better design is for 'in' to always return a boolean, in both safe and unsafe D
code.



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-07 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475


Alex Rønne Petersen xtzgzo...@gmail.com changed:

   What|Removed |Added
   -------------------------------------
   CC  |        |xtzgzo...@gmail.com


--- Comment #4 from Alex Rønne Petersen xtzgzo...@gmail.com 2012-01-07 06:28:41 PST ---
I would be against making 'in' return bool for AAs. I often do:

if (auto x = foo in someAA)
    // do something with *x

Doing a lookup after checking for foo's presence in someAA is ugly compared to
this.
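
A runnable version of this idiom might look as follows. It is a small sketch:
the helper name bump and the sample keys are made up for illustration.

```d
import std.stdio : writeln;

// Hypothetical helper: increment the value for `key` if it is present.
// Returns true when the key was found.
bool bump(int[string] aa, string key)
{
    // `in` yields a pointer to the value, or null when the key is absent,
    // so the test and the access share a single lookup.
    if (auto x = key in aa)
    {
        *x += 1; // update in place through the pointer
        return true;
    }
    return false;
}

void main()
{
    int[string] someAA = ["foo": 42];

    assert(bump(someAA, "foo"));
    assert(someAA["foo"] == 43);

    assert(!bump(someAA, "bar"));
    assert(("bar" in someAA) is null); // a failed lookup inserts nothing

    writeln(someAA);
}
```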



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-07 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #5 from Alex Rønne Petersen xtzgzo...@gmail.com 2012-01-07 06:29:23 PST ---
Furthermore, such a change would break way too much code.



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-07 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #6 from bearophile_h...@eml.cc 2012-01-07 07:18:55 PST ---
(In reply to comment #4)
> I would be against making 'in' return bool for AAs. I often do:
>
> if (auto x = foo in someAA)
>     // do something with *x
>
> Doing a lookup after checking for foo's presence in someAA is ugly compared to
> this.

What is ugly is returning a pointer in a language like D, where pointers are
usually not necessary.

What's bad or ugly about code like this? I think it's more readable:

if (foo in someAA) {
    // do something with someAA[foo]
}



[Issue 4475] Improving the compiler 'in' associative array can return just a bool

2012-01-07 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #7 from Alex Rønne Petersen xtzgzo...@gmail.com 2012-01-07 07:22:48 PST ---
If you need to use x multiple times inside the if statement's true branch, you
end up having to declare a variable, e.g.:

if (foo in someAA)
{
    auto x = someAA[foo];
    someFunction(otherStuff, x, x, moreStuff);
}

As opposed to:

if (auto x = foo in someAA)
    someFunction(otherStuff, *x, *x, moreStuff);

I don't see why pointers are so bad. While, yes, D is a high-level language, it
is not C# or Java.
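
The two styles compared in this thread can be put side by side in a
self-contained form. This is a sketch only: addTwice is a hypothetical
stand-in for someFunction, and viaIndex, viaPointer and viaGet are made-up
names. The third variant uses the built-in `get` property, which avoids the
pointer but cannot tell a missing key apart from a stored default value.

```d
// Hypothetical stand-in for someFunction from the comment above.
int addTwice(int a, int b)
{
    return a + b;
}

// Style 1: test with `in`, then index. This is written as two lookups,
// which an optimizer may or may not merge into one.
int viaIndex(int[string] aa, string key)
{
    if (key in aa)
    {
        auto x = aa[key];
        return addTwice(x, x);
    }
    return 0;
}

// Style 2: a single lookup through the pointer returned by `in`.
int viaPointer(int[string] aa, string key)
{
    if (auto x = key in aa)
        return addTwice(*x, *x);
    return 0;
}

// Style 3: `get` with a default value; no pointer, but no way to
// distinguish "absent" from "present with value 0".
int viaGet(int[string] aa, string key)
{
    auto x = aa.get(key, 0);
    return addTwice(x, x);
}

void main()
{
    auto aa = ["foo": 21];

    assert(viaIndex(aa, "foo") == 42);
    assert(viaPointer(aa, "foo") == 42);
    assert(viaGet(aa, "foo") == 42);

    assert(viaIndex(aa, "bar") == 0);
    assert(viaPointer(aa, "bar") == 0);
    assert(viaGet(aa, "bar") == 0);
}
```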
