On Wed, 18 Apr 2012 15:06:11 +0200, SomeDude wrote: > On Tuesday, 17 April 2012 at 18:51:07 UTC, H. S. Teoh wrote: >> On Tue, Apr 17, 2012 at 08:44:42PM +0200, Andrej Mitrovic wrote: >>> On 4/17/12, H. S. Teoh <[email protected]> wrote: >>> > But even then, I'm considering if .keys should return a mutable >>> > array if the key is a value type >>> >>> How about having .keys for mutable and .ikeys for immutable returns? >> >> That's one possibility. Then .keys can come with the caveat that it may >> involve hidden performance costs in the form of internal .dup's. But >> I'm not sure this is a good idea, because there are too many >> string-keyed AA's in existing code that will now suddenly become very >> slow because they use .keys instead of .ikeys, and the strings are >> getting .dup'd all over. >> >> >> T > > Hidden duplications are not acceptable in my opinion.
First of all, distinguish between back-end implementation of AA, and template interface(s), which may or may not, add restrictions and "extra" information and functions, to please one's personal idealogical D purity and immutability. Backend rt.aaA, currently is generic pointer / TypeInfo based, which is good for performance and reducing code bloat. Current version does not handle D types as well as it could. Actual BB struct, only holds TypeInfo reference for key type, which has to be initialised on adding first key. By generally returning a pointer to node value slot, aaA allows the compiler to supply the missing value TypeInfo work, which works for a good part of the functionality (but not on key removal/node deletion, for instance). aaA does not handle postblit for keys correctly. Structs with postblit and/or complex destructors are not properly done either. This can be seen by using a reference counted struct. Normal D code handles the postblit, destructors properly, (meaning I haven't tested really enough code). Entanble them in AA as key or value, and counting will not work as expected. I know I'm eccentric, so please ignore the question of why reference counting. Immutable keys would ban postblit and destructors for struct. However it is possible to have aaA implementation, with only a small cost, that handles struct postblit and destructors correctly, and refcounting (as an example, please), comes out correctly. I have working code in folder rt, viewable at http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/d2-xml-dev/files/ head:/rt/ There is an IAA, defining a low level interface (really, real interface). Maybe dmd could learn to call interface directly instead of indirect C functions in aaA. However the overhead is very slight. Not possible to have aaA integrated with this without using template, as dmd might not supply initilization TypeInfo for value type. Integration with current aaA C functions almost worked, but not quite, after I found I needed to initialise value TypeInfo. Need to declare with the aaI.AssociativeArray template. The template below will support non-immutable keys for lookups for string types, but not for replace or insert of keys. This is why inX method is const void* key, believe it or not, so it is easily persuaded to take string or char[]. delX should be const as well, may have to fix this. In order to demonstrate IAA is more or less complete and working interface for an aaA, I have made 2 different implementations, which use IAA http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/d2-xml-dev/view/ head:/rt/aaSLink.d, a direct rip off, with above mentioned few warts erased, of the current rt.aaA implementation. and http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/d2-xml-dev/view/ head:/rt/aaArrayMap.d which uses entirely arrays, (even a double linked list of integer indexes is an array). I'm telling another lie here, it manages a partial implementation of untyped arrays, rt.unarray, which uses TypeInfo to manage memory content for keys and values, including postblit annoyances, to implement the genericity implied by IAA. Its a little bit slower, which shows a bit with over a million keys, but acceptable. arrayMap maintains insertion order, as long as no key deletions followed by insertions. Now that there is 2 , or maybe more potential implementations, need a factory method, to create a default, if no deliberate initialisation. Thats currently done in rt.aadefault, which can use a version flag, to initialise the global factory. alias IAA function(TypeInfo keyti,TypeInfo valueti) createDefaultAA; shared createDefaultAA gAAFactory; interface IKeyRange { void popFront(); AAKeyRef front(); bool empty(); } interface IValueRange { void popFront(); AAValueRef front(); bool empty(); } interface IAA { /// equals checks same implementation. bool equals(IAA other, TypeInfo_AssociativeArray rawTi); /// setup types. Won't work /// void init(TypeInfo_AssociativeArray aati); /// setup types void init(TypeInfo kti,TypeInfo vti); /// implement initialisation from a literal va_list. Won't work. ///void init(TypeInfo_AssociativeArray ti, size_t length, ...); /// implement initialisation from a literal. Won't work. /// void init(TypeInfo_AssociativeArray ti, void[] keys, void[] values); /// return number of stored key-value pairs as a property size_t length(); /// append all the values from array index pairs, matching key and value types. /// return increase in size void append(void[] keys, void[] values); /// append or replace single key value pair. return 0 if successful (for apply2 delegate) //int append(void* key, void* value); dg2_t getAppendDg(); // return a int append(void* key, void* value); /// reference counting , or not void release(); void addref(); /// rehash. Return IAA IAA rehash(); /// Range interface for keys IKeyRange byKey(); /// Range interface for values IValueRange byValue(); /// remove key value pair bool delX(void* pkey); /// get, and return if location was created or was existing void* getX(void* pkey, out bool isNew); /// Return existing or null void* inX(const void* pkey); /// return unique copy of array of values ArrayRet_t values(); /// return unique copy of array of keys ArrayRet_t keys(); /** At least length 2, 0 = length of hash map. 1 = count of nodes which are empty. 2 = count of nodes with chain length 1, and so on, to length of array. */ uint[] statistics(); /// return new instance with copy of each key-value pair //IAA dup(); /// Delegate for each key int apply(dg_t dg); /// Delegate for each key and value int apply2(dg2_t dg); /// Typeinfo for key. TypeInfo keyType(); /// Typeinfo for value. TypeInfo valueType(); /// remove all pairs void clear(); /// Get new interface with same configuration, but empty IAA emptyClone(); } // how D will not see it struct AA { IAA a; } alias IAA function(TypeInfo keyti,TypeInfo valueti) createDefaultAA; shared createDefaultAA gAAFactory; struct AssociativeArray(Key, Value) { private: IAA p; public: alias AssociativeArray!(Key,Value) MyAAType; /// Support non-immutable lookups for immutable string types. /// TODO: make this more concise and broader? static if (isSomeString!Key) { static if (is(Key==string)) { alias const(char)[] CKey; } else static if (is(Key==wstring)) { alias const(wchar)[] CKey; } else static if (is(Key==dstring)) { alias const(dchar)[] CKey; } } else { alias Key CKey; } @property size_t length() { return (p !is null) ? p.length : 0; } void init(createDefaultAA factory) { if (p) { p.release(); p = null; } p = factory(typeid(Key), typeid(Value)); } /// Post-Blits this(this) { if (p) { p.addref(); } } ~this() { if (p) { p.release(); p = null; } } MyAAType rehash() @property { p = p.rehash(); return MyAAType(p); } Value[] values() @property { return (p !is null) ? *cast(Value[]*) p.values() : null; } Key[] keys() @property { return (p !is null) ? *cast(Key[]*) p.keys() : null; } int opApply(scope int delegate(ref Key, ref Value) dg) { return p !is null ? p.apply2(cast(_dg2_t)dg) : 0; } int opApply(scope int delegate(ref Value) dg) { return p !is null ? p.apply(cast(_dg_t)dg) : 0; } void opAssign(Value[Key] op) { if (p is null) init(gAAFactory); auto dg = p.getAppendDg(); foreach(k,v ; op) { dg(&k,&v); } } void opCatAssign(Value[Key] op) { if (p is null) init(gAAFactory); auto dg = p.getAppendDg(); foreach(k,v ; op) { dg(&k,&v); } } void opCatAssign(MyAAType appendMe) { auto q = appendMe.p; if (q is null) // nothing to do return; if (p is null) p = q.emptyClone(); q.apply2(p.getAppendDg()); } Value opIndex(CKey key) { auto pvalue = (p !is null) ? p.inX(&key) : null; if (pvalue is null) throw AAError.error(AAError.OPINDEX_FAIL); return *(cast(Value*) pvalue); } Value opIndex(CKey key, lazy Value defaultValue) { auto pvalue = (p !is null) ? p.inX(&key) : null; return p ? *(cast(Value*) pvalue) : defaultValue; } Value get(CKey key, lazy Value defaultValue) { auto pval = (p !is null) ? p.inX(&key) : null; return pval ? *(cast(Value*) pval) : defaultValue; } Value* opIn_r(CKey key) { return ( p !is null) ? cast(Value*) p.inX(&key) : null; } // passed value is stored if it didn't exist, returned if it did. bool getOrPut(Key key, ref Value val) { if (p is null) init(gAAFactory); bool existed = void; auto spot = cast(Value*) p.getX(&key, existed); if (!existed) *spot = val; else val = *spot; return existed; } void put(Key key, Value val) { if (p is null) init(gAAFactory); bool existed = void; auto spot = cast(Value*) p.getX(&key, existed); *spot = val; } uint[] chainLengths() { if (p is null) return [0,0]; return p.statistics(); } void opIndexAssign(Value val, Key key) { if (p is null) init(gAAFactory); bool existed = void; auto spot = cast(Value*) p.getX(&key, existed); *spot = val; } MyAAType dup() { MyAAType result; if (p !is null) { auto q = p.emptyClone(); p.apply2(q.getAppendDg()); result.p = q; } return result; } void clear() { if (p !is null) { p.clear(); } } bool remove(CKey key) { return (p !is null) ? p.delX(&key) : false; } @property auto byKey() { return (p !is null) ? p.byKey() : null; } @property auto byValue() { return (p !is null) ? p.byValue() : null; } }
