Ken- If you could make a short script that generates the problem along with the output/error messages, that would help.
Do you have $PDL::BIGPDL set? Might try with that set to 1.
I'll try the problem code on PDL-2.007 to see if that is the
reason for the differences.

--Chris

On Thu, Nov 20, 2014 at 6:18 PM, LYONS, KENNETH B (KENNETH)
<k...@research.att.com> wrote:
> Chris
> I'm running perl 5.8.8 on a rather old linux system. I installed the perl
> modules rather recently from the PDL site, so I'd expect they are up to date
> with whatever is there. From the names of the files, I'd say it's 2.007.
>
> I've tried a variety of ways of using the inplace method, and none of them
> produced a perl error akin to what you got below. The errors were coming out
> of the PDL module itself, complaining about the size of the piddle being over
> 1GB. Given the dimensions of the piddle that is being calculated (around 200
> MB), that shouldn't have happened--unless it's using doubles, which would
> make it ~1.6 GB. Like I said, I got around the problem in kind of a hack, by
> just slicing things up 20K rows at a time--but I'd really like to find a way
> to do it right!
>
> Among the things I tried were these:
>    $sigs->xchg(0,1) *= $present;
>    $sigs->xchg(0,1)->inplace->mult($present,0);
>    PDL::Ops::mult(inplace $sigs->xchg(0,1), $present, 0);
>    $sigs->xchg(0,1)->inplace *= $present;
> None of which got around the error.
>
> Below is what finally worked (but only by occupying more memory than it
> should):
>
>    ($psize) = $present->dims;
>    $STEPSIZE = 20000;
>    # note: it's known that $present and $sigs have the same size!
>    for ($p = 0; $p < $psize; $p += $STEPSIZE) {
>        my $start = $p;
>        my $end = $start+$STEPSIZE-1;
>        $end = $psize-1 if $end >= $psize;
>        $sigs->xchg(0,1)->slice("$start:$end,:,:") *=
>            $present->slice("$start:$end");
>    }
>
> Like I said, it's a bit of a hack. But it does wind up doing the appropriate
> filtering on the $sigs matrix.
>
> Ken
>
> p.s.
> I don't know if it makes a difference, per se, but you are evidently
> operating in an interactive environment, not an actual perl script. I'm
> using this to automate thru a very large body of data, eventually to be run
> automatically on a daily basis, so it's written as a script that calls the
> PDL modules. The error I refer to above was appearing in the error output of
> the perl command.
>
> KL
>
> ----------------------------------------------------------------------------
>
> Below is the remainder of the thread that was mostly sidebar:
> -------------------------------------
> Hi Ken-
>
> You could sync up with the message I forwarded
> to perldl by replying with this message to that
> thread. The main reason for keeping the discussion
> on the list is so that others can benefit from the
> discussion and/or offer other points of view/facts/...
>
> I tried the following in pdl2 and was not able to generate
> an error. You are right that all byte args shouldn't be
> expanded to double intermediates. I'm using PDL-2.007_03
> on cygwin64/win7 and the *= works fine but I get an error
> with the inplace construct (not the same as yours)
>
> pdl> $sigs = (10*random(40,150000,26))->floor->byte
> pdl> $present = (20*random(150000))->floor->byte
> pdl> $ns = $sigs->copy
> pdl> ?vars
> PDL variables in package main::
>
> Name         Type   Dimension         Flow  State          Mem
> ----------------------------------------------------------------
> $ns          Byte D [40,150000,26]          P         148.77MB
> $present     Byte D [150000]                P           0.14MB
> $sigs        Byte D [40,150000,26]          P         148.77MB
> pdl> $sigs->xchg(0,1) *= $present  # works
> pdl> $sigs = $ns->copy
> pdl> $sigs->xchg(0,1)->inplace *= $present
> Runtime error: Can't modify non-lvalue subroutine call at (eval 484) line 5.
>
> What are your os/platform specs and what version of PDL are
> you using?
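As a rough check on the sizes discussed above, the following plain-Perl sketch (no PDL required) works out how much raw data a 40 x 150000 x 26 piddle holds at one byte per element versus eight bytes per element (double), and lists the slice ranges that a 20000-row step produces. The helper names piddle_bytes and chunk_ranges are made up for this illustration, the arithmetic ignores any per-piddle overhead, and treating "MB" as 2**20 bytes is an assumption based on the ?vars figures.

```perl
use strict;
use warnings;

# Hypothetical helper: raw data size in bytes for the given dimensions.
sub piddle_bytes {
    my ($bytes_per_elem, @dims) = @_;
    my $n = 1;
    $n *= $_ for @dims;
    return $n * $bytes_per_elem;
}

# Hypothetical helper: inclusive [start, end] index pairs covering
# 0 .. $size-1 in steps of $step, mirroring the workaround loop bounds.
sub chunk_ranges {
    my ($size, $step) = @_;
    my @ranges;
    for (my $start = 0; $start < $size; $start += $step) {
        my $end = $start + $step - 1;
        $end = $size - 1 if $end >= $size;
        push @ranges, [$start, $end];
    }
    return @ranges;
}

my @dims = (40, 150000, 26);
my $MB   = 1024 * 1024;   # assumption: sizes are reported in units of 2**20 bytes

printf "full piddle as byte:   %.2f MB\n", piddle_bytes(1, @dims) / $MB;
printf "full piddle as double: %.2f MB\n", piddle_bytes(8, @dims) / $MB;
printf "one 20000-row chunk as double: %.2f MB\n",
    piddle_bytes(8, 40, 20000, 26) / $MB;

# Ranges in the "start:end" form used by the slice() calls in the loop.
print "slice ranges: ",
    join(' ', map { "$_->[0]:$_->[1]" } chunk_ranges(150000, 20000)), "\n";
```

The byte figure comes out at 148.77 MB, matching the ?vars listing, while the same dimensions at eight bytes per element come to roughly 1190 MB, which is over 1GB. That is consistent with the guess that a full-size double temporary triggers the error, but it does not by itself show that such a temporary is actually created.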
>
> --Chris
>
> On Thu, Nov 20, 2014 at 2:47 PM, LYONS, KENNETH B (KENNETH)
> <k...@research.att.com> wrote:
>> (Didn't understand your first line, as there was no cc on this message? I
>> pretty much automatically avoid ever using reply-all, but I guess in this
>> case that's how it's supposed to work, right? How do I cc it to get the
>> thread to match up?)
>>
>> Actually, all the pdls involved are byte type. I was assuming when I saw
>> the errors occurring that it was somehow generating a double intermediate,
>> because it should have had plenty of room if it stayed as byte.
>>
>> The specific code was as follows:
>>
>>    # sigs is byte, with dimensions about 40 x 150000 x 26
>>    # present is byte, with dimension of 150000
>>    $sigs->xchg(0,1)->inplace *= $present;
>>
>> I had tried numerous ways of using inplace in that line, and none of them
>> avoided the complaint that it had run out of memory (although the memory
>> usage prior to that command was about 10%). So if it's not generating a
>> double intermediate, I don't see why it would run out of memory (it
>> shouldn't have exceeded about 20% or so). I finally got it to work by
>> splitting the structures up into slices of about 20K rows each, and doing
>> the calculation that way.
>>
>> Other approaches?
>>
>> Ken
>>
>> -----Original Message-----
>> From: Chris Marshall [mailto:devel.chm...@gmail.com]
>> Sent: Wednesday, November 19, 2014 4:23 PM
>> To: LYONS, KENNETH B (KENNETH)
>> Subject: Re: [Perldl] matching vectors inside a PDL
>>
>> <re-cc-ing the perldl list>
>>
>> Thanks for the background. If you hit a snag, feel free to
>> post to the perldl list.
>> We're usually able to help for
>> specific problems especially if accompanied by code
>> demonstrating the problem:
>>
>>   If I have a big byte piddle $a and I multiply it in-place
>>   my PDL session crashes because of a huge intermediate
>>   temp:
>>
>>     pdl> $a = (10*random(100))->floor->byte;
>>
>>     pdl> $a->inplace->mult(5,0);
>>     Error message here or crash
>>
>> Without a specific example, I would guess that the problem
>> is the piddle you are multiplying by (or perl scalar) is of
>> type double which would result in an intermediate temp of
>> double type which would then collapse down to a byte piddle
>> again at the end. If both arguments to multiply are of byte
>> type, you can avoid the big double intermediate temp. E.g.
>>
>>   pdl> p pdl(5)->type
>>   double
>>   pdl> p byte(5)->type
>>   byte
>>
>> Improved type support is planned for the PDL3 work. My
>> initial ideas for bitfield support can be seen here:
>>
>> http://mailman.jach.hawaii.edu/pipermail/pdl-porters/2013-December/006132.html
>>
>> Hope this helps,
>> Chris
>>
>> On Wed, Nov 19, 2014 at 1:42 PM, LYONS, KENNETH B (KENNETH)
>> <k...@research.att.com> wrote:
>>> Chris
>>>
>>> In answer to your question: my path in was as follows: I wanted to find a
>>> way to implement an LP on a medium-size problem (<~10K variables), and the
>>> rest of my code was in perl--so I went looking for an LP implementation in
>>> perl. I was expecting to find a C-compiled module that would do an LP
>>> specifically. I found some instances of that sort of thing, but I also ran
>>> across one using PDL. It didn't do quite what I wanted, but when I saw the
>>> PDL site, it was obvious this was something I needed to know about. I
>>> wound up writing my own simplex implementation in PDL to do specifically
>>> what I needed, and that worked great--and I was pretty blown away at the
>>> speed. So then I started looking into how I could back up and get the
>>> datasets I was dealing with implemented as PDLs to start with.
>>> So I've got a good bit of code now using PDL, not just the little simplex
>>> program (which was only a few dozen lines--that was pretty easy to
>>> implement in PDL.)
>>>
>>> I continue to have issues with the documentation, though. Just as one
>>> example from today: the mult function seems to claim that you can get it to
>>> operate in-place. And for me that was important, because I'm dealing with
>>> a large dataset (of byte variables). But, not only does "mult" by itself
>>> cause an error because it isn't exported, but when I try to use it as
>>> PDL::Ops::Mult(inplace ...), or PDL::Mult(inplace ...), or as
>>> $piddle->inplace->mult(...), it completely fails to avoid generating a
>>> large intermediate. That was clobbering my program, repetitively, so I
>>> finally punted and decided to break that step up into segments with only a
>>> few hundred thousand elements to multiply in each (using slices), and that
>>> got me around the problem. But there was nothing in the documentation that
>>> seemed to suggest that would be necessary. It also seemed, although I
>>> didn't document this carefully, that changing the default PDL type didn't
>>> have any impact on the size of that temporary intermediate (I think it was
>>> using double no matter what I did--whereas using byte would have been fine.)
>>>
>>> I'd love it, in this context, if there were a PDL type of "bit" by the way,
>>> since that's actually what this problem is using--it's a 3D binary matrix,
>>> of ones and zeroes, with up to ~3*10^9 elements. When the number of
>>> elements goes above ~200M, when I'm using bytes, I have to do things to
>>> break it up and process one segment at a time, and it would be nice if that
>>> weren't necessary--but there is evidently no implementation of a "bit" type
>>> in PDL.
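To put a number on what a one-bit-per-element representation would save, here is a small plain-Perl sketch that uses the built-in vec function to pack 0/1 flags into a bit string. This is not a PDL feature and says nothing about how a PDL bit type might eventually work; the helper names pack_bits and get_bit are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a list of 0/1 flags into a bit string,
# eight flags per byte, using Perl's built-in vec().
sub pack_bits {
    my @flags = @_;
    my $bits  = '';
    vec($bits, $_, 1) = $flags[$_] ? 1 : 0 for 0 .. $#flags;
    return $bits;
}

# Hypothetical helper: read flag number $i back out of the bit string.
sub get_bit {
    my ($bits, $i) = @_;
    return vec($bits, $i, 1);
}

my @flags = (1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0);   # 11 flags
my $bits  = pack_bits(@flags);

printf "%d flags stored in %d bytes\n", scalar(@flags), length($bits);
print "flag 3 is ", get_bit($bits, 3), "\n";

# Raw storage for ~3*10^9 binary elements: one byte each versus one bit each.
my $n = 3e9;
printf "byte per element: %.2f GB, bit per element: %.2f GB\n",
    $n / 2**30, $n / 8 / 2**30;
```

The eleven flags fit in two bytes, and the same factor of eight applies to the ~3*10^9-element case. Element-wise arithmetic on packed bits would still need bitwise operations rather than the ordinary multiply used in this thread, so this only illustrates the storage side.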
>>>
>>> Ken
>>>
>
> -----Original Message-----
> From: Chris Marshall [mailto:devel.chm...@gmail.com]
> Sent: Saturday, November 15, 2014 11:42 AM
> To: Derek Lamb
> Cc: LYONS, KENNETH B (KENNETH); perldl
> Subject: Re: [Perldl] matching vectors inside a PDL
>
> Hi Ken and welcome to the PDL community!
>
>> On Nov 14, 2014, at 1:33 PM, LYONS, KENNETH B (KENNETH)
>> <k...@research.att.com> wrote:
>>
>> Yes, most of this I knew, but thanks. It’s because of that behavior
>> of > and <, that you mentioned, that I thought that ‘==’ would compare
>> element by element instead of on the whole vector.
>>
>> Have you ever tried, for example, to search the documentation for, say, the
>> function “list”? It gives you every occurrence of the word “list” in the
>> documents (which, needless to say, is rather voluminous, and the first few
>> hundred entries have nothing to do with the function!) There should be some
>> analog of the “man” command in unix that gives you information about the
>> *function* without all the other garbage. I think it’s just doing something
>> akin to a grep thru the documents.
>
> In addition to the help/? and apropos/?? in the PDL shells
> (pdl2 and perldl) there is the command line version pdldoc
> which can be used starting with 'pdldoc pdldoc' to get the usage.
>
> When I do 'pdldoc -a list' I get 45 lines of output all
> of whose descriptions seem relevant to a general search
> for something having to do with 'list' including the 'list'
> command itself. This type of problem is not specific to
> PDL as searching the docs for any complicated system
> or program does tend to produce a large number of
> incomprehensible and not particularly useful results.
>
> It is definitely desired to have smarter and more useful
> documentation searches. The ability to add keywords
> would be nice and is on the feature request list, I believe.
>
> If you are on a unix-ish system, you might try something
> like 'pdldoc -a list | grep --color list' to make the output
> more visually comprehensible. (We should probably add
> that to the PDL shell output)
>
>> It’s horribly designed in that regard. The software itself is great, and
>> I’m very happy with the results, but finding the simplest little thing in
>> the docs can be a total pain!
>
> I understand the frustration. What would really help PDL
> development would be to know how you got to using
> PDL without being introduced to the concepts that would
> have made the learning curve less steep. (At least you
> found the mailing list and used it. :-)
>
> Happy PDL-ing!
> Chris
_______________________________________________
Perldl mailing list
Perldl@jach.hawaii.edu
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl