Re: Strange optimizer behaviour - this might be the solution

Thomas Fischbacher Wed, 15 Jan 2003 08:55:38 -0800


Digging a bit deeper into the internals of CMUCL to resolve this issue
myself, I found out the following:


* (time (let ((h (make-hash-table :test 'equalp)) (a (make-array 10))) (dotimes (j 
1000000) (setf (gethash a h) 1))))
Compiling LAMBDA NIL:
Compiling Top-Level Form:

Evaluation took:
  0.75 seconds of real time
  0.75 seconds of user run time
  0.0 seconds of system run time
  0 page faults and
  272 bytes consed.
NIL
* (time (let ((h (make-hash-table :test 'equalp)) (a (make-array 10 :element-type 
'fixnum))) (dotimes (j 1000000) (setf (gethash a h) 1))))
Compiling LAMBDA NIL:
Compiling Top-Level Form:

Evaluation took:
  1.35 seconds of real time
  1.35 seconds of user run time
  0.0 seconds of system run time
  0 page faults and
  0 bytes consed.

This roughly seems to account for the factor of two.

I suppose the problem lies buried herein: code/new-hash.lisp

(defun internal-equalp-hash (s-expr depth)
  (declare (type index depth) (values hash))
  (typecase s-expr
    ;; The pointers and immediate types.
    (list (sxhash-list s-expr depth :equalp t))
    (fixnum (ldb sxhash-bits-byte s-expr))
    (character (char-code (char-upcase s-expr)))
    [...]
    (array
     (typecase s-expr
       (simple-vector (vector-equalp-hash (truly-the simple-vector s-expr) depth))
       (vector (vector-equalp-hash s-expr depth))
       (t (array-equalp-hash s-expr depth))))

Specialized arrays, like fixnum arrays, are not simple-vector-p, so the
(truly-the simple-vector s-expr) is not passed into the
vector-equalp-hash, hence the aref looping through the vector's entries
will have to dispatch at every access, greatly reducing performance.

In fact, if I re-write vector-equalp-hash:

(defmacro vector-equalp-hash (vector depth)
  `(if (= ,depth sxhash-max-depth)
       0
     (if (or (array-has-fill-pointer-p ,vector)
             (adjustable-array-p ,vector)
             (array-displacement ,vector)
             )
         ;; Hope this interpretaton is correct:
         ;; every 1d array (=vector) which is not simple
         ;; has a fill-pointer, is adjustable, or has displacement,
         ;; OR, if not, is of one of the special "tightly packed" types.
         (array-equalp-hash ,vector ,depth)
       (let* ((length (length ,vector))
              (hash length)
              (,depth (+ ,depth 1)))
         (declare (type index length) (type hash hash))
         (cond
          ((typep ,vector '(simple-array fixnum (*)))
            (dotimes (index (min length sxhash-max-len) hash)
              (declare (type index index))
              (sxmash hash (internal-equalp-hash (aref (truly-the (simple-array fixnum 
(*)) ,vector) index) ,depth))))
          ((typep ,vector '(simple-array (signed-byte 32) (*)))
           (dotimes (index (min length sxhash-max-len) hash)
             (declare (type index index))
             (sxmash hash (internal-equalp-hash (aref (truly-the (simple-array 
(signed-byte 32) (*)) ,vector) index) ,depth))))
          ;; XXX should be exhaustive!
           (t
            (dotimes (index (min length sxhash-max-len) hash)
              (declare (type index index))
              (sxmash hash (internal-equalp-hash (aref ,vector index) ,depth)))))))))

then:

*  (time (let ((h (make-hash-table :test 'equalp)) (a (make-array 10))) (dotimes (j 
1000000) (setf (gethash a h) 1))))
Compiling LAMBDA NIL:
Compiling Top-Level Form:

Evaluation took:
  0.81 seconds of real time
  0.76 seconds of user run time
  0.0 seconds of system run time
  0 page faults and
  1080 bytes consed.
NIL
*  (time (let ((h (make-hash-table :test 'equalp)) (a (make-array 10 :element-type 
'fixnum))) (dotimes (j 1000000) (setf (gethash a h) 1))))
Compiling LAMBDA NIL:
[GC threshold exceeded with 4,296,176 bytes in use.  Commencing GC.]
[GC completed with 2,750,632 bytes retained and 1,545,544 bytes freed.]
[GC will next occur when at least 4,750,632 bytes are in use.]

Compiling Top-Level Form:
Evaluation took:
  0.8 seconds of real time
  0.8 seconds of user run time
  0.0 seconds of system run time
  0 page faults and
  0 bytes consed.
NIL
*

which is fine enough. However, when thinking about this issue a bit more,
it may be much much nicer for hashing to take just (part of) the
vector's data region as obtained by system:vector-sap (or an equivalent in
case of fixnum arrays, which do not support taking the vector-sap) and
hash this directly. (Since consing arrays returns a zeroed memory region,
we cannot have cruft in padding bytes, so that technique should be okay.)
This should have the potential to produce a considerable performance boost
for hashing of such arrays.

-- 
regards,               [EMAIL PROTECTED]              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)

Re: Strange optimizer behaviour - this *might* be the solution

Reply via email to

Re: Strange optimizer behaviour - this might be the solution