Another angle: I can't tell how much of the data you collect in Perl hash structures, but they are *much* more memory intensive than PDL data arrays.
Your best chance would be to allocate the destination pdl and then use slice assignments to put the hash data into its correct place. Beware: one issue with Perl is that it dies if it runs out of memory, which is a pain. If you preallocate the big piddle, then maybe you'll get the crash in the Perl code, which could give you an idea of where the memory is going.

--Chris

On Tue, Feb 14, 2012 at 11:22 AM, David Mertens <[email protected]> wrote:
> Cliff -
>
> Has your client given you some sample data so that you can try to
> reproduce the error on your own machine? If so, a collection of warnings
> dumped to a logfile might at least tell you which line of code is croaking.
>
> Allocation of large piddles (many hundreds of megabytes) has been reported
> to be a problem elsewhere. One thing I have done on Linux to work around
> this problem is to build a FastRaw file piece by piece, then memory-map
> the file. Although this is not a possibility on Windows (no PDL support for
> memory mapping on Windows yet), it might provide a means for a solution. You
> could build a piddle into a FastRaw file with one script, then have a
> different script readfraw that file. If you pull in this file early
> in your (second) Perl process, you have a higher likelihood of getting the
> contiguous memory request that PDL needs for the large data array.
>
> I know it's not ideal, but I hope that helps. I should probably try to
> figure out how to add memory-mapping support on Windows and then document
> this technique so that others can use it.
> For building the FastRaw file, I can dig up some sample code and send it
> along if that would help, but I won't be able to get to it until tonight at
> the earliest (and I make no guarantees, as it's Valentine's Day :-)
>
> David
>
> On Tue, Feb 14, 2012 at 9:26 AM, Clifford Sobchuk
> <[email protected]> wrote:
>>
>> Hi Folks,
>>
>> I am running into a problem where I am reading in a large amount of data
>> (variable, depending on log size). The data is being pushed into a Perl
>> array and then converted into a piddle. I think that it might be the
>> conversion from Perl array to piddle, but I am not sure. How can I find
>> out where the issue exists and correct it? The end user's computer
>> (laptop) will apparently often be in this situation. Since the data is
>> intermixed with text that needs to be used to hash each specific
>> attribute, I can't simply use an rgrep or rcols import. I can use rcols
>> for each section, but this would result in using glue to build up the
>> piddle slowly (in groups of 20 to 100, depending on the datum for that
>> attribute).
>>
>> Example pseudo-code:
>>
>>   foreach my $line (@lines) {
>>     $index1 = $1 if $line =~ /index1:\s+(\d+)/;
>>     $index2 = ...;
>>     if ($datastart && !$dataend) {
>>       push @{$myhash{$index1}{$index2}{datum1}}, $1 if $line =~ /mydata/;
>>       $dataend = 1 if $line =~ /$eod/;
>>     }
>>   }
>>   foreach my $index1 (sort keys %myhash) {
>>     # ... for each index
>>     $data1 = pdl(@{$myhash{$index1}{$index2}{datum1}});
>>   }
>>
>> The raw text files are on the order of 0.5 to 14 GB and are being run on
>> win32 (Vista, which I know has a 2 GB limit for applications). I hope
>> that this provides enough information to scope the issue.
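Chris's preallocate-and-slice suggestion, applied to per-index chunks like the `@{$myhash{$index1}{$index2}{datum1}}` arrays above, might look roughly like this. A minimal sketch, not code from the thread: the chunk data is made up, and the real script would take a first pass over the hash to compute the total length.

```perl
use strict;
use warnings;
use PDL;

# Stand-ins for the per-index Perl arrays the log parser builds,
# e.g. @{$myhash{$index1}{$index2}{datum1}} (made-up values).
my @chunks = ([1, 2, 3], [4, 5], [6, 7, 8, 9]);

# First pass: total length, so the big piddle is allocated exactly once.
my $total = 0;
$total += scalar(@$_) for @chunks;

# Preallocate the destination instead of glue()-ing piece by piece,
# which copies the growing piddle on every call.
my $big = zeroes(double, $total);

# Second pass: copy each Perl array into its slot via a slice assignment.
my $pos = 0;
for my $chunk (@chunks) {
    my $end = $pos + @$chunk - 1;
    $big->slice("$pos:$end") .= pdl(@$chunk);
    $pos = $end + 1;
}

print $big, "\n";    # prints [1 2 3 4 5 6 7 8 9]
```

Freeing each Perl array (e.g. `@$chunk = ()`) right after its slice is copied would also keep the hash and the piddle from both being fully populated at the same time.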
>>
>> Thanks,
>>
>> CLIFF SOBCHUK
>> Ericsson
>> Core RF Engineering
>> Calgary, AB, Canada
>> Phone 613-667-1974 ECN 8109 x71974
>> Mobile 403-819-9233
>> [email protected]
>> yahoo: sobchuk
>> http://www.ericsson.com/
>>
>> _______________________________________________
>> Perldl mailing list
>> [email protected]
>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>
> --
> "Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it." -- Brian Kernighan
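David's FastRaw workaround can be sketched roughly as below. This is a sketch, not the sample code David offered to dig up: the filename is made up, and it assumes PDL::IO::FastRaw, whose writefraw() writes a raw data file plus a small `.hdr` header that readfraw() uses to reconstruct the piddle. In real use the two halves would live in two separate scripts, so the readfraw() happens while the second process's address space is still fresh enough to satisfy one large contiguous allocation.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FastRaw;

# Script 1: build the piddle (piece by piece, in real life) and dump it.
my $data = sequence(1000);            # stand-in for the parsed log data
writefraw($data, 'bigdata.raw');      # writes bigdata.raw + bigdata.raw.hdr

# Script 2: read it back as the first big allocation of a fresh process.
# On Linux, mapfraw('bigdata.raw') would memory-map the file instead.
my $copy = readfraw('bigdata.raw');
print $copy->info("Type: %T, Dims: %D"), "\n";
```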
