Kelly Jones wrote:
> I want an efficient subroutine that inserts $elt into @x and returns
> the percentile of $elt in @x.
> 
> For example, if @x is (2,2,3,3,3,4,5,6,7,8) and $elt is 3, the
> subroutine returns .363636... and @x is now
> (2,2,3,3,3,3,4,5,6,7,8). Why .363636...?:
> 
>  % After insertion, @x has 11 elements. $elt (3) beats 2 of those
>  elements and ties 4 (including itself). Ties count as half-wins, so
>  the new element beats 2+(4/2) or 4 elements, and 4/11 is .3636...
> 
> Below is one way to do it. I'd turn it into a subroutine if I wanted
> to use it, but it's really inefficient. If I insert 100 elements and
> want the percentile of each as its inserted, it has to do something
> like 50K comparisons. The subroutine's also ugly, but ignore that.
> 
> I'm guessing a subroutine that maintained a sorted list would work
> much better.
> 
> My ugly inefficient solution:
> 
> @mylist = (2,2,3,3,3,4,5,6,7,8);
> $elt = 3;
> push(@mylist,$elt); #insert the element first
> 
> for $i (@mylist) {
>   if ($elt>$i) {$win++; next} # how many times does $elt beat list values
>   if ($elt<$i) {$lose++; next} # how many times does $elt lose to list value
>   $tie++; # ties count as half-wins, including the element I just inserted
> }
> 
> print "WIN: $win, TIE: $tie, LOSE: $lose\n";
> # could use $#mylist+1 for denominator below; 1. is to force floating point
> print 1.*($win+$tie/2)/($win+$lose+$tie);
> print "\n";

A nice little problem. The code below will do what you want. It can be optimised
further, but make sure it's not already fast enough for you. It assumes a sorted
list, and keeps it sorted by inserting the new item before the first existing
element equal to or greater than the new one. Just sort your data before you
first use it if it isn't already done.

HTH,

Rob


use strict;
use warnings;

use List::MoreUtils qw/first_index last_index/;

my @x = (2,2,3,3,3,4,5,6,7,8);

my $percentile = insert_item([EMAIL PROTECTED], 3);

print "Percentile: $percentile\n";

sub insert_item {

  my ($array, $item) = @_;

  my $start = first_index {$_ >= $item} @$array;
  $start = @$array if $start < 0;
  splice @$array, $start, 0, $item;
  my $count = 1 + last_index {$_ == $item} @$array[$start .. $#$array];

  my $percentile = ($start + $count / 2) / @$array;

  return $percentile;
}

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/


Reply via email to