Hi, I created a database of 85958 persistent objects, each containing 31 string slots for items such as first, middle, and last names, company name, address, etc. Total disk space is about 1.3GB.
Now, I wanted to write a small routine to generate CSV files containing the information in this database. However, I wanted to generate one CSV file per state, so I wrote this code:

    ;;; utility functions

    (defun open-db ()
      "Open the Elephant DB, if not already opened."
      (unless *store-controller*
        (open-store *dentemax-spec* :max-locks 10000 :max-objects 10000))
      *store-controller*)

    (defun close-db ()
      "Close the Elephant DB, if opened."
      (when *store-controller*
        (close-store)))

    ;;; export function
    ;;; basically, given a filename and a state (string), export all objects
    ;;; whose 'state index matches the given state

    (defun write-csv-file (filename state)
      "Export CSV file of providers in a given state."
      (when (open-db)
        (with-open-file (stream filename
                                :direction :output
                                :if-exists :supersede
                                :if-does-not-exist :create)
          (princ-csv (csv-header) stream) ; write the header
          (map-inverted-index #'(lambda (k o)
                                  (declare (ignore k))
                                  (princ-csv (record-list o) stream))
                              'provider 'state :value state))))

So far, this seems to be working fine. Some notes:

    (time (open-db))

    Evaluation took:
      0.692 seconds of real time
      0.232830 seconds of total run time (0.124051 user, 0.108779 system)
      33.67% CPU
      21 forms interpreted
      315 lambdas converted
      1,776,202,259 processor cycles
      5,661,240 bytes consed

Not bad for opening the database.

I was able to generate a CSV file just fine:

    (time (write-csv-file "sample.csv" "CA"))

    Evaluation took:
      14.196 seconds of real time
      13.495860 seconds of total run time (12.846913 user, 0.648947 system)
      [ Run times consist of 1.064 seconds GC time, and 12.432 seconds non-GC time. ]
      95.07% CPU
      89 lambdas converted
      36,814,250,083 processor cycles
      1,419,496,056 bytes consed

I then wanted to create an automated way of generating a file for each state in the database. So, I wrote this:

    (defun unique-states ()
      "Generates a list of unique states from the DB."
      (open-db)
      (let ((states nil))
        (map-inverted-index #'(lambda (k o)
                                (declare (ignore o))
                                (pushnew k states :test #'string-equal))
                            'provider 'state)
        (nreverse states)))

And here is where my questions come in:

    (time (unique-states))

    Evaluation took:
      6.970 seconds of real time
      6.807557 seconds of total run time (6.672462 user, 0.135095 system)
      [ Run times consist of 0.290 seconds GC time, and 6.518 seconds non-GC time. ]
      97.68% CPU
      18,073,334,071 processor cycles
      140,438,296 bytes consed

    ("AL" "AR" "AZ" "CA" "CO" "CT" "DC" "DE" "GA" "HI" "IA" "ID" "IL" "IN"
     "KS" "KY" "LA" "MA" "MD" "ME" "MI" "MN" "MO" "MS" "MT" "NC" "ND" "NE"
     "NH" "NJ" "NM" "NV" "NY" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX"
     "UT" "VA" "VT" "WA" "WI" "WV" "WY")

1) Is it normal for it to take "so long" to "simulate" the equivalent of SELECT DISTINCT state FROM source? When I ran the same query on a freshly launched MySQL instance, it returned the results in:

    62 rows in set (0.00 sec)

2) Is my approach of selecting the unique states inefficient?

3) Is there a better way to perform this "query" in Elephant?

BTW, this is all with the latest Elephant from http://www.common-lisp.net/project/elephant/darcs/elephant-1.0 in SBCL 1.0.24 on OS X and BDB 4.7.25.

Thanks in advance,
JD