Nice! (gc) was the trick.

I still noticed lags with (gc 300), so I bumped it up to (gc 800),
and it loaded 4M records in 34 seconds.

I switched my code to use a (class +Invoice) and do everything in memory.

Calculating a SUM is very quick:

! (bench (let Amt 0 (mapc '((This) (inc 'Amt (: Amount))) List)))
1.152 sec

This is about the same amount of time as SQL Server.
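
For anyone following along, here is a minimal sketch of the in-memory setup. The Customer property, the constructor arguments, and the sample values are made up for illustration; only the class name and the Amount property come from the code above.

```picolisp
# Hypothetical class layout: 'Customer' and the sample data are assumptions
(class +Invoice)

(dm T (Cust Amt)
   (=: Customer Cust)
   (=: Amount Amt) )

(setq List
   (list
      (new '(+Invoice) 1001 250)
      (new '(+Invoice) 1002 100)
      (new '(+Invoice) 1001 50) ) )

# Same loop as in the benchmark: binding 'This' lets (: Amount) read each object
(let Amt 0
   (mapc '((This) (inc 'Amt (: Amount))) List)
   Amt )

# The built-in 'sum' should give the same total without the explicit counter
(sum '((This) (: Amount)) List)
```

Both forms walk the list once, so I would not expect a large timing difference between them.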

I also benchmarked against R and Julia. Julia choked at around 2 GB of RAM
and then just spun and spun. R loaded the file in about the same amount of
time and was able to do a SUM very quickly, but it choked when I tried an
aggregate by customer number.

Sidebar: Is there a way to stop the interactive session from printing the
return value of a statement? For example, if I do a (setq ABC L) where L is
a million items, I'd prefer the option of not having all million items
print on my console. I've worked around this by wrapping it in a prog and
returning NIL. Is there an easier way?
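
To make the workaround concrete, here is a small sketch. The list built with range is just stand-in data. The second form uses the built-in 'nil', which as far as I can tell from the reference evaluates its body and always returns NIL; I am not claiming it is the recommended answer.

```picolisp
# Stand-in data: a list of one million numbers
(setq L (range 1 1000000))

# Workaround described above: the REPL only prints "-> NIL"
(prog (setq ABC L) NIL)

# Shorter variant: 'nil' runs its body and returns NIL
(nil (setq ABC L))
```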

The next thing I'm working on is aggregating sums by customer. So far, (by)
and (group) have been too slow. I was pleasantly surprised to see accu run
well:

! (bench (mapc '((X) (accu 'Sum X 1)) CustomerNumbers))
13.137 sec
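
A small sketch of what that accu call builds, on made-up sample data. Per the reference, accu keeps its totals in an association list stored in the variable, keyed by the second argument. The Invoices list of (customer . amount) pairs is a hypothetical stand-in for the real records.

```picolisp
# Hypothetical sample data
(setq CustomerNumbers '(1001 1002 1001 1003 1001))

# Count occurrences per customer, as in the benchmark
(off Sum)
(mapc '((X) (accu 'Sum X 1)) CustomerNumbers)
(cdr (assoc 1001 Sum))  # count for customer 1001

# Same idea for amounts: (customer . amount) pairs are an assumption
(setq Invoices '((1001 . 250) (1002 . 100) (1001 . 50)))
(off Totals)
(for Inv Invoices
   (accu 'Totals (car Inv) (cdr Inv)) )
(cdr (assoc 1001 Totals))  # summed amount for customer 1001
```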

I was expecting that I could use idx as an alternative, but I can't seem to
get it to come back in a reasonable time.

As a simple test to get a list of unique customers:

! (bench (mapc '((X) (idx 'UniqueCustomers X T)) CustomerNumbers))
163.309 sec
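
For comparison, here is the same idx pattern on a tiny made-up list, next to the built-in uniq. As far as I understand it, idx maintains a plain binary tree that is not rebalanced on insert, so input that arrives already sorted could degrade it badly; whether that explains the timing above is only a guess on my part.

```picolisp
# Hypothetical sample data
(setq CustomerNumbers '(1001 1002 1001 1003 1001))

# Insert every key into the idx tree; duplicates are not added twice
(off UniqueCustomers)
(mapc '((X) (idx 'UniqueCustomers X T)) CustomerNumbers)

# With a single argument, idx returns the keys as a sorted list
(idx 'UniqueCustomers)

# 'uniq' removes duplicates directly, keeping first-occurrence order
(uniq CustomerNumbers)
```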

Am I doing something wrong with idx? I'm wondering if accu runs better
because of how it stores its values. I remember reading that picoLisp
internally uses hashes for symbols and properties, and I think accu is
using a property of the symbol to store the values.

Thanks again

Joe





On Thu, May 31, 2012 at 12:16 PM, Alexander Burger <a...@software-lab.de> wrote:

> > using, e.g. (gc 250) instead, to pre-allocate memory. This avoids that
> > the garbage collector runs again and again.
>
> BTW, 'proc' is very useful here to check the process size:
>
> $ pil +
> : (proc 'pil)
>  PID  PPID  STARTED  SIZE %CPU WCHAN  CMD
> 25575  1831 18:14:55  1512  0.3 -      /usr/bin/picolisp /usr/lib/picolisp/lib.l
> -> T
>
> : (gc 250)
> -> 250
>
> : (proc 'pil)
>  PID  PPID  STARTED  SIZE %CPU WCHAN  CMD
> 25575  1831 18:14:55 258512 2.8 -      /usr/bin/picolisp /usr/lib/picolisp/lib.l
> -> T
> :
>
> Cheers,
> - Alex
