The truth is that doing anything at the extremes requires a programmer
to be clever. PDL just allows you to do this better than Perl.

Even the mighty Perl TIMTOWTDI runs out of ideas when it comes to
computing huge datasets like this. At the very minimum, PDL offers more
ways of being clever to make these things work quickly.

For me personally, if I want something very simple that is guaranteed to
be very fast, I go straight to PDL::PP and write a quick sub to do it.
You can spend five minutes with Inline::Pdlpp developing something that
works, and then put the code in a library somewhere for future use. You
get code that's easier to maintain that way, IMO, since it doesn't need
to be as "clever" as Perl-level PDL code.
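A minimal sketch of that workflow, assuming PDL and Inline::Pdlpp are installed (the sub name and the trivial sum kernel are purely illustrative):

```perl
use PDL;
# Compile a tiny PDL::PP kernel at runtime; Inline caches the compiled C.
use Inline Pdlpp => <<'EOPP';
pp_def('quicksum',
    Pars => 'a(n); [o]b();',
    Code => q{
        $GENERIC() tmp = 0;
        loop(n) %{ tmp += $a(); %}
        $b() = tmp;
    },
);
EOPP

my $x = sequence(10);
print quicksum($x), "\n";   # the loop runs in compiled C, not Perl
```

Once it works, the same `pp_def` can be moved into a `PDL::PP`-built module for reuse.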

-Judd


On Fri, 2010-07-09 at 10:17 -0500, P Kishor wrote:
> On Fri, Jul 9, 2010 at 10:10 AM, Benjamin Schuster-Boeckler
> <[email protected]> wrote:
> > My experience with very large datasets in PDL comes down to this:
> >
> > USE THE SMALLEST SUITABLE DATATYPE
> >
> > I can't stress enough how important that is :-)
> >
> 
> 
> Yes, very true. But keep in mind: if even a single value doesn't fit
> your smallest datatype, the *entire* piddle will be promoted to the
> larger datatype.
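The underlying point is that a piddle stores every element in one common type, so a single outlier dictates the storage cost for all of them. A small sketch, assuming a stock PDL install (the values are arbitrary):

```perl
use PDL;
# Every element shares one type, so one large value forces the whole
# piddle into the bigger type:
my $small = ushort(1 .. 10);          # 2 bytes per element
my $big   = long(1 .. 9, 100_000);    # 100_000 > 65535, so 4 bytes each
printf "%d vs %d bytes\n", $small->nelem * 2, $big->nelem * 4;
```

At 500M elements, that difference is a gigabyte of RAM.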
> 
> > I'm dealing with vectors of ~500M values (whole human chromosomes, if
> > you're interested :-). If I only need a bitmask, I use a byte() piddle; if
> > I have counts, I use byte/ushort; and I sometimes even convert rational
> > numbers to integers for performance reasons. I'm pretty sure that most of
> > your problems on a Mac are due to over-allocating memory.
> >
> > In general, I find that PDL DOES eat Perl's lunch a million times if you do 
> > things cleverly. I was able to do sliding window averaging on those 500M 
> > vectors using PDL::PP in a second or two, compared to hours in pure Perl.
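For reference, a windowed mean can also be vectorised in pure PDL with a cumulative sum, without dropping to PDL::PP (a sketch, assuming PDL is installed; the window size is arbitrary):

```perl
use PDL;
# Mean over a sliding window of $w elements via running totals:
my $x = random(1000);
my $w = 5;
my $c = $x->cumusumover;                         # c(i) = x(0) + ... + x(i)
my $win = ($c->slice("$w:-1") - $c->slice("0:-" . ($w + 1))) / $w;
# $win holds nelem($x) - $w windowed means, with no Perl-level loop
```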
> >
> 
> 
> I am sure. I am sold on PDL. But it does require me to be "very
> clever," while Perl is probably more forgiving. It will probably be a
> long while before I reach the "very clever" stage and am capable of
> using PDL::PP to its fullest.
> 
> We really need an extensive and updated "Cookbook" that documents all
> the best practices, with examples.
> 
> > Cheers,
> > Ben
> >
> > On 9 Jul 2010, at 16:53, P Kishor wrote:
> >
> >> Craig, David, others,
> >>
> >> I find your explanation satisfying, but not the actual results that I
> >> am getting. I am experiencing a more stable performance from Perl,
> >> with the performance scaling predictably. PDL shows itself to be more
> >> moody. From one run to another, the performance can really swing. This
> >> is on my MacBook with no other user process running (meaning, I am not
> >> ripping music or watching a movie on Hulu at the same time...).
> >>
> >> First, no doubt my simplistic PDL approach was wrong. I figured, I
> >> have to calculate one "column" based on two other "columns" -- "Hey,
> >> the PDL docs show how to get a column... use slice." So that is what
> >> I went with. However, using Craig's better and more efficient
> >> calculation approach, I did get much better results, though not
> >> across the board.
> >>
> >> I used Craig's reworked script and ran it three times. The results are
> >> below (use fixed width font to see the results), but here is some
> >> discussion --
> >>
> >> Both David and Craig implied that making the data (the array for Perl
> >> and the piddle for PDL) would be more efficient in Perl because Perl
> >> does some up-front memory allocation, so 'push'ing an element onto
> >> the array would not be costly. That is not the case. PDL is pretty
> >> good; in fact, PDL is faster at converting the array into a piddle
> >> than Perl is at making the array in the first place.
> >>
> >> Another assertion was that PDL will eat Perl's lunch when it comes to
> >> calculation. That is also not the case *always*. PDL is much faster at
> >> smaller data sets. But, at a certain threshold, (for me, that
> >> threshold is 3 million), PDL gets bogged down. Actually, at 3.5
> >> million, PDL gets very slow, and at 4 million, it basically locks up
> >> my computer.
> >>
> >> Another interesting issue -- Perl seems to be better at sharing the
> >> resources. When the Perl calculation is running, my machine is
> >> responsive. I can switch back to the browser, scroll a page, etc. When
> >> the PDL calc is running, it is like my machine is frozen.
> >>
> >> This kinda worries me. If we write up the gotchas and the limits
> >> within which PDL use is optimal, then it is "caveat emptor" and all
> >> that. However, on a more realistic front, I was hoping to use PDL with
> >> a 13-million-element piddle. I did some tests, and I found that with a
> >> 2D piddle where ("first D" * "second D") = 13 million, PDL was smokingly
> >> fast. I am wondering, though -- will its performance change if the
> >> piddle is a 1D piddle that is 13 million elements long? Does it
> >> matter to PDL whether my dataset is a "long rope" or a "carpet", both
> >> with the same "thread count" (to use a fabric analogy)?
> >>
> >> Test results (reformatted) shown below
> >>
> >>
> >> count: 10000
> >> ============================
> >>           Perl       PDL
> >> ----------------------------
> >> make data: 0.0097     0.0065
> >> calculate: 0.0064     0.0014
> >>
> >> make data: 0.0106     0.0065
> >> calculate: 0.0064     0.0014
> >>
> >> make data: 0.0104     0.0065
> >> calculate: 0.0063     0.0014
> >> ____________________________
> >>
> >>
> >> count: 100000
> >> ============================
> >>           Perl       PDL
> >> ----------------------------
> >> make data: 0.0962     0.0791
> >> calculate: 0.0624     0.0108
> >>
> >> make data: 0.0966     0.0811
> >> calculate: 0.0621     0.0109
> >>
> >> make data: 0.0966     0.0789
> >> calculate: 0.0626     0.0109
> >> ____________________________
> >>
> >>
> >> count: 1000000
> >> ============================
> >>           Perl       PDL
> >> ----------------------------
> >> make data: 0.9626     0.8014
> >> calculate: 0.6269     0.1170
> >>
> >> make data: 0.9656     0.8064
> >> calculate: 0.6275     0.1182
> >>
> >> make data: 0.9643     0.8203
> >> calculate: 0.6275     0.1168
> >> ____________________________
> >>
> >>
> >> count: 2000000
> >> ============================
> >>           Perl       PDL
> >> ----------------------------
> >> make data: 1.7542     1.5168
> >> calculate: 1.2462     0.2381
> >>
> >> make data: 1.7519     1.5221
> >> calculate: 1.2500     0.2391
> >>
> >> make data: 1.7517     1.5226
> >> calculate: 1.2699     0.2394
> >> ____________________________
> >>
> >>
> >> count: 3000000
> >> ============================
> >>           Perl       PDL
> >> ----------------------------
> >> make data: 2.5263     2.5722
> >> calculate: 1.9163     3.2107
> >>
> >> make data: 2.5411     2.2062
> >> calculate: 1.8897     6.9557
> >>
> >> make data: 2.5305     2.2822
> >> calculate: 1.9204     7.2502
> >> ____________________________
> >> On Fri, Jul 9, 2010 at 2:32 AM, Craig DeForest
> >> <[email protected]> wrote:
> >>> Wow, Puneet really stirred us all up (again).  Puneet, as David said, your
> >>> PDL code is slow because you are using a complicated expression, which
> >>> forces PDL to create and destroy intermediate PDLs (every binary operation
> >>> has to have a complete temporary PDL allocated and then freed to store its
> >>> result!).  I attach a variant of your test, with the operation carried out
> >>> as much in-place as possible to eliminate extra allocations.  PDL runs
> >>> almost exactly a factor of 10 faster than raw Perl on my computer in
> >>> this case.
> >>> Note that the original ingestion of the Perl array into PDL is quite slow:
> >>> it generally takes slightly longer to create the PDL than to generate the
> >>> random numbers and create the Perl array in the first place!  That is
> >>> because PDL has to make several passes through the Perl array to determine
> >>> its size, and then has to individually probe and convert each numeric
> >>> value in the Perl array.
> >>>
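Craig's in-place point, in miniature (a sketch; the shapes and values are arbitrary, and PDL must be installed):

```perl
use PDL;
# A compound expression like ($a + $b) * 2 allocates a temporary piddle
# for every intermediate result. Compound assignment mutates in place:
my $a = random(1_000_000);
my $b = random(1_000_000);

my $slow = ($a + $b) * 2;   # two allocations, one per binary operation

my $fast = $a->copy;        # one allocation up front
$fast += $b;                # updates $fast in place
$fast *= 2;                 # still no new temporaries
```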
> >>> On Jul 9, 2010, at 1:09 AM, David Mertens wrote:
> >>>
> >>> FYI, for really thorough timing results, check out Devel::NYTProf:
> >>> http://search.cpan.org/~timb/Devel-NYTProf-4.03/lib/Devel/NYTProf.pm
> >>>
> >>> You have a lot of things going on to mix up the results - you have both a
> >>> memory allocation and a calculation. As I understand it, Perl will likely
> >>> outperform PDL in the memory allocation portion of this exercise, but PDL
> >>> should eat Perl's lunch for the calculation portion.
> >>>
> >>> Perl will outperform PDL in the memory allocation because, in all
> >>> likelihood, it doesn't perform any allocation with the push. It likely
> >>> already allocated more than three elements for (all of) its arrays, so
> >>> pushing the new value on the array does not cost anything, except for a
> >>> higher up-front memory cost. I suspect this is where PDL is losing to
> >>> Perl - Perl is performing the allocation ahead of where you start the
> >>> timer.
> >>>
> >>> In terms of the calculation itself, PDL should far outperform Perl. The
> >>> reason is that the actual contents of the calculation loop are very slim,
> >>> so the overhead of all the Perl stack manipulation dominates the cost.
> >>> Perl for loops usually make sense when the code inside them involves IO
> >>> operations or other such things, in which case the Perl stack
> >>> manipulations comprise only a small portion of the total compute time.
> >>>
> >>> Try a situation when Perl and PDL allocate their memory as part of the
> >>> timing and see what that gives.
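A minimal sketch of that experiment in core Perl (the PDL side would mirror it; the element count and the summing workload are illustrative choices):

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Time the allocation and the calculation separately, so neither
# hides inside the other's measurement.
my $n = 100_000;

my $t0 = [gettimeofday];
my @data = map { rand() } 1 .. $n;        # build the array inside the timer
my $t_alloc = tv_interval($t0);

$t0 = [gettimeofday];
my $sum = 0;
$sum += $_ for @data;                     # the calculation loop
my $t_calc = tv_interval($t0);

printf "alloc: %.4fs  calc: %.4fs\n", $t_alloc, $t_calc;
```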
> >>>
> >>> David
> >>>
> >>> --
> >>> Sent via my carrier pigeon.
> >>> _______________________________________________
> >>> Perldl mailing list
> >>> [email protected]
> >>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
> >>>
> >> --
> >> Puneet Kishor http://www.punkish.org
> >> Carbon Model http://carbonmodel.org
> >> Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
> >> Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
> >> Nelson Institute, UW-Madison http://www.nelson.wisc.edu
> >> -----------------------------------------------------------------------
> >> Assertions are politics; backing up assertions with evidence is science
> >> =======================================================================
> >>
> 
-- 
____________________________
Judd Taylor
Software Engineer

Orbital Systems, Ltd.
3807 Carbon Rd.
Irving, TX 75038-3415

[email protected]
(972) 915-3669 x127
