Re: efficiently building a large list

2012-06-03 Thread Henrik Sarvell
Here's what I think; note that I can't exude the same certainty as Alex, so you'll have to wait for his comment to get the whole picture. See this copy-paste from my terminal:
   : (de Func () (setq L 5))
   -> Func
   : (Func)
   -> 5
   : (println L)
   5
   -> 5
Note how the L variable is set within the function and still holds

Re: efficiently building a large list

2012-06-03 Thread Alexander Burger
Hi Joe, the core of the question here is what you actually stored in the index. Perhaps it helps if I try to explain the 'idx' structure. Each node in an 'idx' tree consists of either one (if it is a leaf node) or two (if it is an internal node) cells. The first cell holds the payload data in

Re: efficiently building a large list

2012-06-03 Thread Joe Bogner
Thank you. Very helpful. I was confused about where the value was actually being stored. I was thinking that in my example it was being stored in the cell in the index. Then, I couldn't figure out how it retained its value after I cleared out the index. Turns out that it actually stored it back on

Re: efficiently building a large list

2012-06-02 Thread Joe Bogner
Hi Henrik - Thanks for sharing. I used your approach and it ran quickly after I built the index using balance.
   (bench (setq SL (by '((X) (get X 'CustNum)) sort L)) T)
   (bench (setq SLC (mapcar '((This) (: CustNum)) SL)) T)
   (off A)
   (bench (balance 'A SLC T))
I'm stumped on one piece. If I run the

Re: efficiently building a large list

2012-06-01 Thread Alexander Burger
On Thu, May 31, 2012 at 01:38:41PM -0400, Joe Bogner wrote:
> Calculating a SUM is very quick!
>    (bench (let Amt 0 (mapc '((This) (inc 'Amt (: Amount))) List)))
>    1.152 sec
Minor optimization: You could also use 'sum' for that:
   (sum '((This) (: Amount)) List)
> Sidebar: Is there a way to
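The two ways of totalling mentioned in this message can be compared on a tiny data set. This is only a minimal sketch: the three sample objects and their amounts are made up for illustration, and 'box' plus 'put' is just one convenient way to get symbols carrying an 'Amount' property.

```picolisp
# Hypothetical sample data: three anonymous symbols with an 'Amount' property
(setq List
   (mapcar
      '((N) (let Obj (box) (put Obj 'Amount N) Obj))
      (10 20 30) ) )

# Explicit accumulator, in the style of the quoted version
(let Amt 0
   (mapc '((This) (inc 'Amt (: Amount))) List)
   Amt )

# Same total using 'sum'
(sum '((This) (: Amount)) List)
```

Both expressions should yield the same total for the sample list; 'sum' simply removes the need for the explicit accumulator variable.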

Re: efficiently building a large list

2012-06-01 Thread Alexander Burger
Hi Joe,
> This is what I came up with to aggregate the data. It actually runs reasonably well. I'm sharing because I always enjoy reading other people's picoLisp code so I figure others may as well.
Yes, thanks! I can't delve into the code's logic at the moment, so just let me make some

Re: efficiently building a large list

2012-05-31 Thread Henrik Sarvell
What happens if you first load the whole thing into memory, i.e. (let Invoices (in ... and then work with the Invoices variable?
On Thu, May 31, 2012 at 9:56 PM, Joe Bogner joebog...@gmail.com wrote:
> I'd like to do some analysis on a large text file of invoices - for example, grouping and

Re: efficiently building a large list

2012-05-31 Thread Alexander Burger
Hi Joe,
> I'd like to do some analysis on a large text file of invoices - for example, grouping and summing. Is this an efficient way to build the list? The file has 4 million rows in it. The first several hundred thousand load quickly and then I notice the time between my checkpoints taking

Re: efficiently building a large list

2012-05-31 Thread Alexander Burger
On Thu, May 31, 2012 at 05:35:09PM +0200, Alexander Burger wrote:
> Each line results in a list of 4 'pack'ed symbols, i.e. 5 cells plus the symbols with each at least 1 cell. So 2.4M rows should be at least 2.3 GB on a 64-bit machine.
Oops! Shame! 2.4e6 times 6 cells on a 64-bit machine are
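The arithmetic in this correction can be checked at the REPL. This is a rough sketch only: the figure of 16 bytes per cell is an assumption (two 64-bit pointers per cell), and 'estimateBytes' is a hypothetical helper name, not something from the thread.

```picolisp
# Rough size estimate for a list of rows.
# Assumption: one cell occupies 16 bytes on a 64-bit machine.
(de estimateBytes (Rows CellsPerRow)
   (* Rows CellsPerRow 16) )

# 2.4e6 rows at 6 cells each
(estimateBytes 2400000 6)
# -> 230400000
```

Under that assumption the total is on the order of a couple of hundred megabytes rather than gigabytes.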

Re: efficiently building a large list

2012-05-31 Thread Alexander Burger
using, e.g. (gc 250) instead, to pre-allocate memory. This avoids that the garbage collector runs again and again. BTW, 'proc' is very useful here to check the process size:
   $ pil +
   : (proc 'pil)
     PID  PPID  STARTED  SIZE %CPU WCHAN CMD
   25575  1831 18:14:55  1512  0.3 -

Re: efficiently building a large list

2012-05-31 Thread Tomas Hlavaty
Hi Joe, Sidebar: Is there a way to disable the interactive session from printing the return of a statement? For example, if I do a (setq ABC L) where L is a million items, I'd prefer the option of not having all million items print on my console. I've worked around this by wrapping it in a
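One way to keep a large result off the console is to wrap the expression in 'nil', which executes its body and returns NIL. This is a minimal sketch: the list built with 'range' is only a stand-in for real data, and whether it matches the exact workaround discussed in this thread is an assumption.

```picolisp
# 'nil' evaluates its body and returns NIL,
# so the REPL prints NIL instead of the million-item list
(nil (setq ABC (range 1 1000000)))

# The variable is still set; inspect it without printing everything
(length ABC)
```

The counterpart 't' works the same way but returns T.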

Re: efficiently building a large list

2012-05-31 Thread Joe Bogner
Thanks Tomas, I've started using nil now. This is what I came up with to aggregate the data. It actually runs reasonably well. I'm sharing because I always enjoy reading other people's picoLisp code so I figure others may as well. My source file has 4 million rows
   : (bench (pivot L 'CustNum))