> [EMAIL PROTECTED] writes:
> The following script is the beginnings of my thoughts of a way of
> Scheming a converter up. 

I have since *substantially* revised the code, making it considerably
more robust and featureful.  

- It now tries to figure out, fairly intelligently, what date
formatting convention the export used.

- It handles splits competently.

There are still a number of "open issues," that will be true for *any*
approach to parsing out QIF files.  

The two most critical are that there is not yet an intelligent way of
deciding:
- What account to assign transactions to, and
- What to do about categorization of transactions.

The last time I tried importing a QIF file into GnuCash, it loaded
everything into the "main checking account," which is extremely
dysfunctional when Quicken does file-by-file exports on an
account-by-account basis.

It seems to me that there are three possible approaches:

a) Assume that transactions are to be loaded into the "current
account."

b) Interactively ask the user which account the transactions apply
to.  It appears that with MS-Money, a QIF file may contain multiple
"sections," which means that the question may get asked multiple
times.

c) Make the user customize a table/script to load things the way they
want them loaded.

A related issue is of how to handle "categories."  

If you load a QIF file into Quicken, when it encounters transactions
with new categories that it has not encountered before, you are given
three choices:
1) Change the old category to a new one that *does* exist in Quicken.
2) Establish the category as a new one in Quicken.
3) Do what's in 2) from now on; create new categories automatically.

The connection between "categories" and "accounts" is that *transfer*
transactions reference "categories" that *are* "accounts."

I'm still not to the point of creating GnuCash transactions, but this
version of the script is *substantially* better in the following ways:

a) It generates output files that are nicely human-readable.

(Incidentally, I would thoroughly support the idea of the text-based
output form creating a Scheme "define.")

b) It now works, with a one-line change, with both PLT Scheme and
Guile.

c) It now contains the components (splitter, string searcher) that
weren't included in the last "release."

The next thing that I expect to add is functionality to build a list
of all the "categories" that are referenced, which then obviously
leaves the "opening" of comparing the categories in the import file to
those existant in GnuCash.

;;;;; Open Issues:
;;;;;
;;;;; - What to do supposing the date guesses aren't good enough.
;;;;;   There are three cases that are likely:
;;;;;   1.  If something is just plain inconsistent.  (Hopefully not.)
;;;;;       In this case, how drastically do we freak out?
;;;;;   2.  Come Year 2001, the range of years will look a whole lot like
;;;;;       the range of months (or days).   Big-time ambiguity.
;;;;;   3.  If all the transactions lie in the first 12 days of the month,
;;;;;       then days will not be readily told apart from months.
;;;;;
;;;;; - What account do we load into?
;;;;;   1.  Hopefully this can be determined in an implicit manner...
;;;;;   2.  The alternative is that something interactive need be done for
;;;;;   a group of transactions, querying the user to select the appropriate
;;;;;   account.
;;;;;
;;;;; - What to do with transfers?
;;;;;
;;;;;   A transaction where the category is [AA Chequing] or [ACM MasterCard]
;;;;;   is obviously a transfer to/from that account.  Unfortunately, there is
;;;;;   no guarantee that an account by the same exact name exists in GnuCash.
;;;;;   Probably ought to cons up a list of categories, agree them to GnuCash,
;;;;;   and, most crucially, construct a "category translation table"
;;;;;   to indicate what to do with them.
;;;;;
;;;;;   The same is true, albeit less critical, for income/expense categories.
;;;;;
;;;;; - Further transfer issue:
;;;;;
;;;;;   Note that a QIF load may provide duplicate transactions for transfers,
;;;;;   once you load in the amounts for both sides of the transaction.
;;;;;
;;;;; - Category management:
;;;;;
;;;;;   What should we do if there are categories in the QIF file that don't
;;;;;   exist in GnuCash?  Create the new category, maybehaps, but probably
;;;;;   by collecting up a list, and giving the option of converting QIF
;;;;;   categories to "something new."   Again, reference back to the
;;;;;   "category translation table"
;;;;;
;;;;; - Transactions should not be marked off as being forcibly reconciled on
;;;;;   the GnuCash side, as the reconciliation hasn't been done there.  
;;;;;
;;;;;   Bad Things Happen if we double-load a batch of QIF transactions, and think
;;;;;   it has all been fully reconciled.
;;;;;


;;; Parse QIF
 (define srcdir "/home/cbbrowne/kwicken")
 (define destdir "/home/cbbrowne/kwicken/qif/")
 
 (define (directory-list srcdir)
    ;;;;; Collect filenames up into a list (similar to PLT-Scheme)
   (let
       ((directory (opendir srcdir))
        (flist '()))
     (begin
       (let loop
         ((file (readdir directory)))
         (if
          (eof-object? file)
          #f
          (begin
            (set! flist (cons file flist))
            (loop (readdir directory)))))
       (closedir directory)
       flist)))
 
 (define indir (directory-list srcdir))
 
 ; IMPLEMENTS Substring search
 ; AUTHOR Ken Dickey
 ; DATE 1991 August 6
 ; LAST UPDATED
 ; NOTES
 ;Based on "A Very Fast Substring Search Algorithm", Daniel M. Sunday,
 ;CACM v33, #8, August 1990.
 ;;
 ;; SUBSTRING-SEARCH-MAKER takes a string (the "pattern") and returns a function
 ;; which takes a string (the "target") and either returns #f or the index in
 ;; the target in which the pattern first occurs as a substring.
 ;;
 ;; E.g.: ((substring-search-maker "test") "This is a test string")  -> 10
 ;;       ((substring-search-maker "test") "This is a text string")  -> #f
 
 (define (substring-search-maker pattern-string)
   (define num-chars-in-charset 256)  ;; update this, e.g. for iso latin 1
   
   (define (build-shift-vector pattern-string)
     (let* ((pat-len (string-length pattern-string))
            (shift-vec (make-vector num-chars-in-charset 
                                    (+ pat-len 1)))
            (max-pat-index (- pat-len 1)))
       (let loop ((index 0))
         (vector-set! shift-vec 
                      (char->integer 
                       (string-ref pattern-string index))
                      (- pat-len index))
         (if (< index max-pat-index)
             (loop (+ index 1))
             shift-vec))))
   
   (let ((shift-vec (build-shift-vector pattern-string))
         (pat-len   (string-length pattern-string)))
     (lambda (target-string)
       (let* ((tar-len (string-length target-string))
              (max-tar-index (- tar-len 1))
              (max-pat-index (- pat-len 1)))
         (let outer ( (start-index 0))
           (if (> (+ pat-len start-index) tar-len)
               #f
               (let inner ( (p-ind 0) (t-ind start-index) )
                 (cond
                   ((> p-ind max-pat-index)  ; nothing left to check
                    #f)       ; fail
                   ((char=? (string-ref pattern-string p-ind)
                            (string-ref target-string  t-ind))
                    (if (= p-ind max-pat-index)
                        start-index  ;; success -- return start index of match
                        (inner (+ p-ind 1) (+ t-ind 1)) ; keep checking
                        ))
                   ((> (+ pat-len start-index) max-tar-index) #f) ; fail
                   (else
                    (outer (+ start-index
                              (vector-ref shift-vec
                                          (char->integer 
                                           (string-ref target-string
                                                       (+ start-index 
                                                          pat-len)
;;;; and now, lots of closing parentheses...
                                                       ))))))))))))))
 ;;--- E O F ---
 
;;; Functions to split up strings
;;; Provides the generic facility to split based on *any* character
;;; We make use of splitting on spaces and on colons...
 
;;; Find the next occurance of [somechar] in the string [string] 
;;; starting at [startpos]
 (define (next-somechar string startpos endpos somechar)
   (let loop 
     ; initialize
     ((pos startpos))
     (cond
       ((>= pos endpos) endpos)   ; Reached end of string
       ((char=? (string-ref string pos) somechar) pos)  ; Reached "somechar"
       (else 
        (loop (+ pos 1))))))
 
 
;;; "good" new, named-let implementation
;;; It now doesn't care about start/end positions; it just plain splits
;;; out the line
 (define (split-on-somechar sourcestring somechar)
   (let loop
     ((pos 0)
      (endpos (string-length sourcestring))
      (result '()))
     (cond
       ((>= pos endpos) result)
       (else
        (let ((nextwhatever (next-somechar sourcestring
                                           pos endpos somechar)))
          (loop
           (+ nextwhatever 1)
           endpos
           (append result (list (substring sourcestring pos nextwhatever)))))))))
 
;;; Look for the first word in a line (space-delimited)    
 (define (first-word string)
   (substring string 0 (next-somechar string 0 (string-length string) #\space)))
 
;;; Turn a line into a list
;;; (split-on-spaces "inbox outbox 1 2  3") --> ("inbox" "outbox" "1" "2" "" "3")
 (define (split-on-colons sourcestring)
   (split-on-somechar sourcestring #\:))
 
 (define (split-on-spaces sourcestring)
   (split-on-somechar sourcestring #\space))
 
 (define fullname 
   (lambda (file)
     (string-append srcdir file)))
 
 (define qifs (substring-search-maker ".qif"))
 (define QIFs (substring-search-maker ".QIF"))
 
 (define (processfile file)
   (let
       ((fullfilename (fullname file)))
     (if
      (eof-object? file)         ; Have we hit the end of the directory?
      #f
      (if (or (qifs file) (QIFs file))  ; Is it a .qif file?
          (begin
            (display "Found QIF File") (newline)
            (rewrite-file file))))))
 
 (define tlist '())
 (define atrans '())
 (define addresslist '())
 
 (define rewrite-file
   (lambda (file)    ; Opens file, rewrites all the lines, closes files
     (display "rewriting file:") (display file) (newline)
     (set! tlist '())   ; Reset the transaction list...
     (set! atrans '())
     (resetdates)  ;  Reset the date checker
     (let*
         ((infile (open-input-file (string-append srcdir file)))
          (outfile (open-output-file (string-append destdir file) 'replace)))
       (begin
         (let loop
           ((line (read-line infile)))
           (if
            (eof-object? line) #f
            (let
                ((newline (rewrite-line line)))
              (loop (read-line infile)))))
         (if 
          (checkdatemaxes)
          (let
              ((format (list date-low date-med date-high)))
            (display ";;; Date format" outfile)
            (newline outfile)
            (display "(define date-format '" outfile)
            (write format outfile)
            (display ")" outfile) 
            (newline outfile))
          (begin
            (display "Problem: Illegal date format!") (newline)
            (display ";;;; Problem - date format conflict!" outfile)
            (newline outfile)))
         (display ";;; Transactional data:" outfile)
         (newline outfile)
         (display "(define transactions '(" outfile)
         (newline outfile)
         (walk-n-print tlist outfile)
         (display ")) ;;; End of transaction data" outfile)
         (newline outfile)
         (close-input-port infile)
         (close-output-port outfile)))))
 
;;; Rewrite a line
 (define (for-every-file filelist)
   (if (null? filelist)
       #f
       (begin
         (if
          (processfile (car filelist))  ; Handle the file
          (for-every-file (cdr filelist))))))
 
 (define qifstate '())
 
 (define rewrite-line 
   (lambda (line)
     (if
      (string=? (substring line 0 1) "!")   ;;; Starts with a !
      (newstate line))                      ;;; Jump to a new state...
     (if (equal? qifstate 'txn)             ;;; If it's a transaction
         (rewrite-txn-line (striptrailingwhitespace line)))))   ;;; Rewrite it
       ;;; otherwise, do nothing...
 
 (define QIFstates  
   '(("!Type:Cat" . 'category)
     ("!Option:AutoSwitch" . 'accounts)
     ("!Clear:AutoSwitch"  . 'account)
     ("!Account" . 'accounts)
     ("!Type:Memorized" . 'memorized)
     ("!Type:Bank" . 'txn)
     ("!Type:CCard" . 'txn)
     ("!Type:Oth A" . 'txn)))
 
;;;;   Strip off trailing whitespace
 (define (striptrailingwhitespace line)
   (let
       ((stringsize (string-length line)))
     (if
      (< stringsize 1)
      ""
      (let*
          ((lastchar (string-ref line (- stringsize 1))))
        (if
         (char-whitespace? lastchar)
         (striptrailingwhitespace (substring line 0  (- stringsize 1)))
         line)))))
 
 (define (newstate line)
   (let*
       ((statepair (assoc (striptrailingwhitespace line) QIFstates)))
     (begin
       (if
        (pair? statepair)
        (set! qifstate (car (cddr statepair)))
        #f))))
 (define (transnull line)
   #f)  ;  do nothing with line
 (define (oops-new-command-type line)
   (write "Oops: New command type!")
   (write line))
 
 (define (rewrite-txn-line line)
   (let*
       ((fchar (substring line 0 1))
        (found (assoc fchar trans-jumptable)))
     (if
      found
      (let 
          ((tfunction (cdr found)))
        (tfunction line))
      (oops-new-command-type line))))
 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 
;;;;  Variables used to handle splits  ;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 
 (define splits? #f)
 (define splitlist '())
 (define splitcategory #f)
 (define splitamount #f)
 (define splitmemo #f)
 (define splitpercent #f)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;; And functions to nuke out the splits ;;;;
;;;; at the start/end of each transaction ;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 (define (resetsplits)   ;;; Do this at end of whole txn
   (set! splits? #f)
   (set! splitlist '())
   (resetsplit))
 
 (define (resetsplit)     ;;;  After each split item
   (set! splitcategory #f)
   (set! splitmemo #f)
   (set! splitpercent #f))
 
;;;;  This function *should* validate that a split adds up to 
;;;;  the same value as the transaction, and gripe if it's not.
;;;;  I'm not sure how to usefully gripe, so I leave this as a stub.
 (define (ensure-split-adds-up)
   #f)
 
 (define (transsplitamt line)
   (set! splits? #T)
   (let*
       ((linelen (string-length line))
        (amount    (numerizeamount (substring line 1 linelen)))
        (amtlist   (cons 'amount  amount))
        (catlist   (cons 'category splitcategory))
        (entry     (list amtlist catlist)))
    ;;; And now, add amount and memo to splitlist
     (set! splitlist 
           (cons entry splitlist))))
 
;;;; percentages only occur as parts of memorized transactions
 (define (transsplitpercent line)
   (set! splits? #T)
   #f)   ;;;; Do nothing; percentages only occur in memorized transactions
 
 (define (transsplitmemo line)
   (set! splits? #T)
   (let*
       ((linelen (string-length line))
        (memo    (substring line 1 linelen)))
     (set! splitmemo memo)))
 
 (define (transsplitcategory line)
   (set! splits? #T)
   (let*
       ((linelen (string-length line))
        (category    (substring line 1 linelen)))
     (set! splitcategory category)))
 
;;;; Check the way the dates look; figure out whether it's
;;;; DD/MM/YY, MM/DD/YY, YY/MM/DD, or whatever...
 (define date-low #f)
 (define date-med #f)
 (define date-high #f)
 (define min-date-low #f)
 (define min-date-med #f)
 (define min-date-high #f)
 (define max-date-low #f)
 (define max-date-med #f)
 (define max-date-high #f)
 (define (resetdates)
   (set! date-low #f)
   (set! date-med #f)
   (set! date-high #f)
   (set! min-date-low 9999)
   (set! min-date-med 9999)
   (set! min-date-high 9999)
   (set! max-date-low 0)
   (set! max-date-med 0)
   (set! max-date-high 0))

 (define (newdatemaxes dpieces)
   (let
       ((p1 (string->number (car dpieces)))
        (p2 (string->number (cadr dpieces)))
        (p3 (string->number (caddr dpieces))))
     (if (< p1 min-date-low)
         (set! min-date-low p1))
     (if (< p2 min-date-med)
         (set! min-date-med p2))
     (if (< p3 min-date-high)
         (set! min-date-high p3))
     (if (> p1 max-date-low)
         (set! max-date-low p1))
     (if (> p2 max-date-med)
         (set! max-date-med p2))
     (if (> p3 max-date-high)
         (set! max-date-high p3))
))
 
 (define (checkdatemaxes)
   (define (favor min max)
     (display "favor:") (display min) (display " ") (display max) (newline)
     (cond
       ((> max 31) 'yy)   ;;; [max > 31] --> Year
       ((and (< max 32) (> max 12)) 'dd) ;;; Max in [13,31] --> Day
       ((= min 0) 'yy)    ;;; [min=0] --> Year 2000
       (else 'mm)))
   (display "(Low Medium High)(Low Medium High)")
   (display (list min-date-low min-date-med min-date-high max-date-low max-date-med 
max-date-high))
   (newline)
   (let
       ((vl (favor min-date-low max-date-low))
        (vm (favor min-date-med max-date-med))
        (vh (favor min-date-high max-date-high)))
     (begin
       (display "VL VM VH") (display vl) (display " ") (display vm) (display " 
")(display vh) (display " ")
       (if (or (eq? vl vm) (eq? vl vh) (eq? vm vh))
         (display "Problem: Range occurs twice!")       ; Problem!  A range appears 
twice!
         (begin
           (set! date-low vl)
           (set! date-med vm)
           (set! date-high vh))))))

;;;; "numerizeamount" takes the commaed string that QIF provides,
;;;; removes commas, and turns it into a number.
(define (numerizeamount amount-as-string)
    (let*
        ((commasplit (split-on-somechar amount-as-string #\,))
         (decommaed (apply string-append commasplit))
         (numeric   (string->number decommaed)))
    (if
        numeric    ; did the conversion succeed?
        numeric    ; Yup.  Return the value
        amount-as-string)))   ; Nope.  Return the original value.
 
;;;; At the end of a transaction, 
;;;; Insert queued material into "atrans" (such as splits, address)
;;;; Add "atrans" to the master list of transactions,
;;;; And then clear stateful variables.
 (define (transend line)   ; End of transaction
   (if (not (null? addresslist))
       (set! atrans (cons (cons 'address addresslist) atrans)))
   (if splits?
       (begin
         (set! atrans (cons (cons 'splits splitlist) atrans))
         (ensure-split-adds-up)))
   (set! tlist (cons atrans tlist))
   (set! addresslist '())
   (resetsplits)
   (set! atrans '()))
 
;;;;;;;;;;;  Various "trans" functions for different 
;;;;;;;;;;;  sorts of QIF lines    
 (define (transmemo line)
   (let*
       ((linelen (string-length line))
        (memo    (substring line 1 linelen)))
     (set! atrans (cons (cons 'memo memo) atrans))))
 
 (define (transaddress line)
   (let*
       ((linelen (string-length line))
        (addline    (substring line 1 linelen)))
     (set! addresslist (cons addline addresslist))))

;;;;;;;  Date-related code 
 (define findspace (substring-search-maker " "))
 (define findslash (substring-search-maker "/"))

;;; Replace spaces in date fields with zeros so
;;;  "4/ 7/99" transforms to "4/07/99"
 (define (replacespace0 string)
   (let
       ((slen (string-length string))
        (spacepos (findspace string)))
     (if spacepos
         (replacespace0
          (string-append
           (substring string 0 spacepos)
           "0"
           (substring string (+ 1 spacepos) slen)))
         string)))
 
 (define (transdate line)
   (let*
       ((linelen (string-length line))
        (date    (replacespace0 (substring line 1 linelen)))
        (dpieces (split-on-somechar date #\/)))
     (set! atrans (cons (cons 'date date) atrans))
     (newdatemaxes dpieces))) ; collect info on date field ordering
                              ; so we can guess the date format at
                              ; the end based on what the population
                              ; looks like

 (define (transamt line)
   (let*
       ((linelen (string-length line))
        (amount  (numerizeamount (substring line 1 linelen))))
     (set! atrans (cons (cons 'amount amount) atrans ))))
 
 (define (transid line)
   (let*
       ((linelen (string-length line))
        (id    (substring line 1 linelen)))
     (set! atrans (cons (cons 'id id) atrans ))))
 
 (define (transstatus line)
   (let*
       ((linelen (string-length line))
        (status    (substring line 1 linelen)))
     (set! atrans (cons (cons 'status status) atrans ))))
 
 (define (transpayee line)
   (let*
       ((linelen (string-length line))
        (payee    (substring line 1 linelen)))
     (set! atrans (cons (cons 'payee payee) atrans ))))
 
 (define (transcategory line)
   (let*
       ((linelen (string-length line))
        (category    (substring line 1 linelen)))
     (set! atrans (cons (cons 'category category) atrans))))
 
 (define 
   trans-jumptable
   (list 
    (cons "^"  transend) (cons "D"  transdate) 
    (cons "T"  transamt) (cons "N"  transid) 
    (cons "C"  transstatus) (cons "P"  transpayee)
    (cons "L"  transcategory) (cons "M"  transmemo)
    (cons "!"  transnull) (cons "U"  transnull)
    (cons "S"  transsplitcategory) (cons "A"  transaddress) 
    (cons "$" transsplitamt) (cons "%" transsplitpercent)
    (cons "E"  transsplitmemo)))
 
 (define (walk-n-print tlist outfile)
   (if
    (null? tlist)
    #f
    (begin
      (process-transaction (car tlist) outfile)
      (walk-n-print (cdr tlist) outfile))))
 
;;; Consider it a "no-brainer" that this is the point at which it would
;;; make sense to drop in code to take the transaction and load it into
;;; GnuCash.
 (define (process-transaction txn outfile)
   (set! txn (rewrite-dates txn))  ;;; Rewrite the date field...
   (write txn outfile)   ;;;; Alternatively, take txn and load into
                         ;;;; GnuCash 
   (newline outfile))

(define (rewrite-dates txn)
    (cond
        ((atom? txn) txn)  ; Atom - return it...
        ((list? txn)       ; List? - Split and process pieces
;            (display "Do 2 pieces:")
;            (display (car txn))
;            (display (cdr txn))
            (cons (rewrite-dates (car txn))
                  (rewrite-dates (cdr txn))))
        ((pair? txn)       ; If it's a pair, see if it's a date...
            (if (eq? (car txn) 'date)
                (cons 'date (reformat-date (cdr txn)))
                txn))))

(define (reformat-date date-as-string)
;    (display "Reformatting date:") (display date-as-string)
    (let*
        ((datesplitup (split-on-somechar date-as-string #\/))
         (p1 (string->number (car datesplitup)))
         (p2 (string->number (cadr datesplitup)))
         (p3 (string->number (caddr datesplitup)))
         (YEAR  0)
         (MONTH 0)
         (DAY   0)
         (dropin (lambda (yy-or-mm-or-dd value)
            (cond   
                ((eq? yy-or-mm-or-dd 'yy)
                 (set! YEAR value))
                ((eq? yy-or-mm-or-dd 'mm)
                 (set! MONTH value))
                ((eq? yy-or-mm-or-dd 'dd)
                 (set! DAY value))))))
        (begin
            (dropin date-low p1)
            (dropin date-med p2)
            (dropin date-high p3)
            ;;; Now, normalize year
            (if (< YEAR 80)
                (set! YEAR (+ YEAR 2000))   ;;; 1900's
                (set! YEAR (+ YEAR 1900)))  ;;; 2000's
;            (display (list YEAR MONTH DAY)) (newline)
            (list YEAR MONTH DAY))))    
 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;; Now, let's actually execute the code...
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;(debug-enable 'backtrace)
 (for-every-file indir)

--
Christopher B. Browne, [EMAIL PROTECTED], [EMAIL PROTECTED]
Web: http://www.ntlug.org/~cbbrowne  SAP Basis Consultant, UNIX Guy
Windows NT - How to make a 100 MIPS Linux workstation perform like an 8 MHz 286
----- %< -------------------------------------------- >% ------
The GnuCash / X-Accountant Mailing List
To unsubscribe, send mail to [EMAIL PROTECTED] and
put "unsubscribe gnucash-devel [EMAIL PROTECTED]" in the body

Reply via email to