Thanks Tomas, I've started using nil now.
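
For anyone following along, a minimal sketch of what that looks like at the
REPL. The variable name ABC comes from my earlier question; the generated
list is only a stand-in for real data.

```picolisp
# (nil . prg) runs prg and always returns NIL, so only "-> NIL" is printed
(nil (setq ABC (range 1 1000000)))

# the list was still assigned
(length ABC)
```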

This is what I came up with to aggregate the data. It actually runs
reasonably well. I'm sharing because I always enjoy reading other people's
picoLisp code, so I figure others may as well.

My source file has 4 million rows:

: (bench (pivot L 'CustNum))
35.226 sec

# outputs 31,000 rows.

My approach is to load it in as follows:

(class +Invoice)
(rel CustNum (+String))
(rel ProdNum (+String))
(rel Amount (+Number))
(rel Quantity (+Number))

(de Load ()
  (zero N)
  (setq L
    (make
      (in "invoices.txt"
        (until (eof)
          (setq Line (line))
          (setq D (mapcar pack (split Line "^I")))  # tab-separated fields
          (link
            (new '(+Invoice)
              'CustNum (car (nth D 1))
              'ProdNum (car (nth D 2))
              'Amount (format (car (nth D 3)))
              'Quantity (format (car (nth D 4))) ) ) ) ) ) )
  T )  # return T so the REPL does not print the whole list
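
To show what the split step does to one row, here is a small sketch on a
made-up line (the values are invented, and chop stands in for what (line)
returns when reading the file):

```picolisp
# split the characters on tab, then pack each field back into a string
(setq D (mapcar pack (split (chop "C1^IP9^I12^I3") "^I")))
# D is now ("C1" "P9" "12" "3")

# the numeric fields are still strings until format converts them
(format (car (nth D 3)))
```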


I can probably clean this up. I tinkered around with various approaches
and this was the best I could come up with in a few hours. At first I was
using something like the group from lib.l but found it to be too slow. I
think my version is faster because it assumes a sorted list: it only
compares each key with the previous one instead of scanning the made list
for a match.

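(de sortedGroup (List Fld)
  (make
    (let (Last NIL LastSym NIL)
      (for This List
        (let Key (get This Fld)
          (if (<> Last Key)
            (prog
              (if LastSym (link LastSym))  # key changed: emit finished group
              (off LastSym)
              (push 'LastSym Key) ) )      # new group starts with its key
          (push 'LastSym This)             # records go in front of the key
          (setq Last Key) ) )
      (link LastSym) ) ) )                 # emit the final group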

And here's the piece that ties it all together:

(de pivot (L Fld)
  (let (SL (by '((X) (get X Fld)) sort L) SG (sortedGroup SL Fld))
    (out "pivot.txt"
      (for X SG
        (let (Amt 0)
          (mapc '((This) (inc 'Amt (: Amount))) (cdr (reverse X)))
          (setq Key (get (car X) Fld))
          (prinl Key "^I" Amt) ) ) ) ) )
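
The (cdr (reverse X)) may look odd, so here is a small sketch of the shape
of one group. sortedGroup pushes the key first and then pushes each record
in front of it, so the key ends up last. The symbols rec1 and rec2 are
placeholders for +Invoice objects.

```picolisp
(off G)
(push 'G "C1")    # the key goes in first, so it ends up at the tail
(push 'G 'rec1)
(push 'G 'rec2)
# G is now (rec2 rec1 "C1")

(cdr (reverse G))  # the records without the key: (rec1 rec2)
(car G)            # any record will do for reading the key field: rec2
```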


(Load)

: (bench (pivot L 'CustNum))
35.226 sec

: (bench (pivot L 'ProdNum))
40.945 sec

It seems the best performance was by sorting, then splitting and then
summing the individual parts. It also makes for a nice report.
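
As a rough sketch of the same sort-then-sum idea on plain data, the helper
below works on (key . amount) pairs instead of objects and folds the
splitting and summing into one pass. sumRuns is a hypothetical name, not
something from lib.l, and it has not been timed against the 4 million row
file.

```picolisp
# Hypothetical helper: sum the CDRs of (key . amount) pairs per key.
# Relies on equal keys being adjacent once the list is sorted.
(de sumRuns (Pairs)
  (make
    (let (Key NIL Amt 0 Seen NIL)
      (for P (sort (copy Pairs))        # copy, because sort is destructive
        (when (and Seen (<> Key (car P)))
          (link (cons Key Amt))         # key changed: emit the finished total
          (zero Amt) )
        (setq Key (car P) Seen T)
        (inc 'Amt (cdr P)) )
      (and Seen (link (cons Key Amt))) ) ) )  # emit the last total, if any

(sumRuns '(("c2" . 5) ("c1" . 10) ("c2" . 7) ("c1" . 1)))
# -> (("c1" . 11) ("c2" . 12))
```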

Sidenote: At first I thought I was getting better performance by using a
modified version of quicksort from Rosetta Code, but then I switched to
the built-in sort and saw considerably better speed.

Thanks for the help, everyone.

On Thu, May 31, 2012 at 3:37 PM, Tomas Hlavaty <t...@logand.com> wrote:

> Hi Joe,
>
> > Sidebar: Is there a way to disable the interactive session from
> > printing the return of a statement? For example, if I do a (setq ABC
> > L) where L is a million items, I'd prefer the option of not having all
> > million items print on my console. I've worked around this by wrapping
> > it in a prog and returning NIL. Is there an easier way?
>
> you could also use http://software-lab.de/doc/refN.html#nil or
> http://software-lab.de/doc/refT.html#t
>
> Cheers,
>
> Tomas
> --
> UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
>
