Hi, Camm -- That's interesting, but I'm confused, and I'm definitely not an expert on hash tables. I looked at the files that implement the "(h)" part of ACL2(h), which is almost certainly what is involving hash tables, and it looks to me like maybe the only 'equal hash tables are for strings.
I'm forwarding this to Jared, since he is the most recent author of that code (plus, you mention him as helping with potentially related reader issues), in case he has time to shed light on this. -- Matt From: Camm Maguire <c...@maguirefamily.org> Date: Mon, 11 Nov 2013 12:49:41 -0500 Greetings! Just a followup -- the remaining time appears to be in sethash for an 'equal hash-table: ============================================================================= index % time self children called name 103979625 sethash [1] [1] 84.2 2.11 49.03 0+103979625 sethash [1] 22.58 6.16 167566772/167566772 fShash_equal [2] 0.00 20.28 119656/131885 alloc_relblock [6] 0.01 0.00 119656/205048 alloc_object [47] 103979625 sethash [1] ----------------------------------------------- 22.58 6.16 167566772/167566772 sethash [1] [2] 47.3 22.58 6.16 167566772 fShash_equal [2] 5.25 0.00 363849475/363849475 hash_eql [12] 0.91 0.00 1174935219/1174940911 eql1 [18] 0.00 0.00 12/2577623 Fand <cycle 2> [151] ----------------------------------------------- 0.29 5.78 3/14 make_cons [9] 1.06 21.19 11/14 alloc_relblock [6] [3] 46.6 1.35 26.97 14 GBC [3] 26.93 0.00 25304834/25331171 sgc_mark_object1 <cycle 1> [5] ============================================================================= This is somewhat remarkable, as the 'eql gethash calls which greatly dominate in number are no longer on the radar. Presumably the algorithm makes some complex cons, (definitely not your grandmother's '(1 2 3) list), uses an 'equal hashtable to make it equal-unique, and then uses that as a key in an 'eql hashtable for the real heavy work. This just reminded me of the work we did earlier regarding the loading of complex conses in compiled files, which overloaded the #= reader until we memoized the routine calculating the hash-equal index. This is barely necessary to the gcl compiler -- the point is to catch errors where the constant list to be compiled in changes during compilation. And as I indicated earlier, we flush the memoizing hash tables on each compile-file. This, together with the implementation of the 'hybrid' #= algorithm suggested by Jared, made the loading of these conses very fast. My question is if we've learned anything which might make the above results yet faster. By default, the hash-equal index descends no more than three levels, car and cdr, into a cons to xor up the index. It does not attempt to descend the entire structure memoizing as one goes like the compiler version. There the depth limit is much greater (1000) due to its purpose and the absence of any table. My intuition tells me that there is no way a memoized version of the generic hash-equal would pay off. It seems we would have to flush on each call, or never. It would only speed up index calculations of great depth, which is only useful in hash tables if your index is insufficiently random at the default depth of 3. This does not appear the case, as #'equal itself is absent from the profiling report, implying that the hit rate to the index is good. I suppose an 'equal hashtable could keep an 'eq hashtable internally for the life of the table. That might be interesting. In any case, I don't want to waste a lot of time reinventing some wheel. If you or any of the other hashtable experts have some wisdom here, I'd be most appreciative. Take care, -- Camm Maguire c...@maguirefamily.org ========================================================================== "The earth is but one country, and mankind its citizens." -- Baha'u'llah _______________________________________________ Gcl-devel mailing list Gcl-devel@gnu.org https://lists.gnu.org/mailman/listinfo/gcl-devel