Jonathan Paton writes:
>> After I sent this I had a flash of enlightenment:
>>    $max = (sort {$a <=> $b} @_)[-1];
>> May be slower, though, I don't know.
>
> How many times have I seen this?  I mean, I've seen
> this construct many times, and the question deserves
> a place in the Perl FAQ.

How to find the min/max value of an array is certainly a FAQ, so it
should be put in the official Perl FAQ.  How does someone recommend
this?

I couldn't find the benchmark discussion from a few months ago, so
I tested the various suggestions already mentioned on the list.  I
used the following max functions:

sub max1 {
    (sort {$a <=> $b} @_)[-1];                       # sort
}
sub max2 {
    my $max = shift;
    foreach (@_) { $max = $_ if $_ > $max }          # if modifier
    return $max;
}
sub max3 {
    my $max = shift;
    map { $max = $_ if ($_ > $max) } @_;             # map in void context
    return $max;
}
sub max4 {
    my $max = shift;
    foreach (@_) { $max = $_ > $max ? $_ : $max; }   # ternary operator
    return $max;
}

I benchmarked these against arrays containing 10, 100, 1000, and 10k
elements.  The results should not be surprising to those who have been
following the discussion.  I also tested equivalent min functions with
nearly identical results.  The actually results will vary depending on
your hardware and software configuration, but these difference should
be typical.

The sort function (max1) was fastest for up to 100 elements; it was
about 160% faster for 10 elements vs foreach (max2), and about 50%
faster for 100 elements.  The map function (max3) was slowest, which
is not surprising because it has to construct a brand new array which
is then wastefully discarded.

For 1000 or more elements, the foreach function (max2) was fastest; it
was 10% faster than sort for 1000 elements and more than 100% faster
for 10k elements.  The sort method placed second for 1000 elements,
but was slowest for 10k elements, even slower than the prodigal map
function; O(n ln n) eventually dominates over O(n).

I'm a litttle puzzled as to why max2 (foreach with if modifier) is
consistently about 25% faster than max4 (foreach with ternary operator).
My guess is that the difference is due to how often the assignment is
done.  With the if modifier, the assignment is done only when necessary;
with the ternary operator, the assignment is done for every element of
the array (most of the time uselessly assigning $max = $max).

Some of the functions above could be modified to take an array
reference, rather than copying every element of the array into @_.
This would cut down the subroutine call overhead and speed up
execution, especially for larger arrays.  One could also just avoid
the subroutine call overhead completely, say by inlining:
    my $maximum = (sort {$a <=> $b} @array)[-1];

It's also worth noting that if you want both the max and the min, or
the top N values, or something similar, then the sort becomes even
more efficient since it only needs to be done once:
    my ($min, $max) = (sort {$a <=> $b} @array)[0, -1];
    my ($first, $second, $third) = (sort {$a <=> $b} @array)[-1, -2, -3];
This single sort is still O(n ln n), but this is likely faster than
repeating a O(n) foreach multiple times.  (You could modify the foreach
to return multiple values upon iterating the list once, but that would
sacrifice significant clarity.)

+ Richard J. Barbalace

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to