ID:               48224
 Updated by:       j...@php.net
-Summary:          array_rand broken sorting semantics
 Reported By:      theultramage at gmail dot com
-Status:           Open
+Status:           Assigned
 Bug Type:         Arrays related
-Operating System: Windows/FreeBSD
+Operating System: *
-PHP Version:      5.2.9
+PHP Version:      5.*, 6CVS (2009-05-09)
-Assigned To:      
+Assigned To:      andrei
 New Comment:

Assigning to Andrei who did that commit you say was wrong. Let him 
decide what to do with this. :)


Previous Comments:
------------------------------------------------------------------------

[2009-05-10 15:07:18] theultramage at gmail dot com

Description:
------------
The code of PHP_FUNCTION(array_rand) contains a conditional call to
array_data_shuffle ('shuffle()' function). It was added in
http://cvs.php.net/viewvc.cgi/php-src/ext/standard/array.c?r1=1.79&r2=1.80&;
to resolve bug http://bugs.php.net/bug.php?id=7492 . There are several
mistakes here:

1. The developer used the comparison (num_req_val == num_avail).
However, these are not constants and change a lot during the execution
of the algorithm. While it is true that the check holds true in the case
where the user requests all elements in the array, it also holds true in
the general case where both of these variables reach the value '0' at
the end of the loop, which is 10+% of the time and occurs randomly. This
causes array_rand to produce sequences that are "in ascending order, but
sometimes completely random". See
http://php.net/manual/en/function.array-rand.php#74406.

2. The developer should have never made this change in the first place.
The user's complaint was "If I ask array_rand() to pick 6 elements out
of 6, the result isn't randomized". But here the user failed to realize
that it's not array_rand's job to produce the results randomly ordered.
It's to "pick K random entries out of N". Therefore the change 1.80 was
out of scope of what array_rand was intended for. The developer should
have instead instructed the user to use shuffle() on the result if he
required a shuffled result.

My proposal: Remove the two lines that do the shuffling in array_rand's
code, since they're broken and don't make sense even if they were fixed.
It's not the function's task to randomize the output's order - that
should be done by the user if he needs to do so.

PS: A more detailed analysis can be found at
http://netvor.sk/~umage/wtf/www/php/array_rand.html .

Reproduce code:
---------------
<?php
        // also see http://php.net/manual/en/function.array-rand.php#74406

        // test case for '3 out of 10'... try this and then try '8 out of 10'
        $n = 20;
        $a = array(0,1,2,3,4,5,6,7,8,9);

        for( $i = 0; $i < $n; $i++ )
        {
          $keys = array_rand($a,3);
          printf("%d %d %d\n", $keys[0], $keys[1], $keys[2]);
        }
?>

Expected result:
----------------
Individual sequences should appear in ascending order, since that's how
the original code generated them.

Actual result:
--------------
The sequences are mostly sorted, sometimes shuffled. This occurs
randomly on each run. The percentage of shuffled sequences increases as
the requested number nears the number of entries in the array.


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=48224&edit=1

Reply via email to