On 10/18/07, Micah Cowan <[EMAIL PROTECTED]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Tony Godshall wrote:
> > On 10/17/07, Tony Godshall <[EMAIL PROTECTED]> wrote:
> >> On 10/17/07, Micah Cowan <[EMAIL PROTECTED]> wrote:
> >>> -----BEGIN PGP SIGNED MESSAGE-----
> >>> Hash: SHA256
> >>>
> >>> Tony Godshall wrote:
> >>>>> Well, I'm don't have much to say about about the other points but one
> >>>>> certainly does not need to keep an array for something like this- with
> >>>>> the classic pseudorandom shuffle algorithm you only need to keep a
> >>>>> count of the ones visited.  Shall I pull out my Knuth?
> >>> That... only applies if you actually keep a _queue_ around, of all the
> >>> ports that you plan to try, and shuffle it. Surely that's more waste
> >>> (65,535 shorts, versus 65,535 _bits_), not less? ...We're not shuffling,
> >>> here, we're choosing.
> >> No, the point was that with a relative prime or two you can walk in a
> >> pseudorandom pattern though, hitting each point only once needing no
> >> array at all.
> >
> > Donald E. Knuth, The Art of Computer Programming, Volume 2, 3rd
> > Edition, pp. 17-19
>
> For the record, this is not what "pseudo-random shuffle" means to me:
> for instance, http://en.wikipedia.org/wiki/Fisher-Yates_shuffle (aka
> "Knuth shuffle"), which does in fact require an in-memory set to be
> permuted.

Yeah, well, it's been 23 years since I took Data Structures, so sue me.

And the shuffle you refer to is an attempt at actual randomness, whereas
what I am talking about is explicitly a function of its *pseudo*-randomness-
it's taking advantage of a characteristic (defect?) of an earlier attempt at
real randomness.

> Yes, that appears to work quite well, as long as we seed it right;
> starting with a consistent X₀ would be just as bad as trying them
> sequentially, and choosing something that does not change several times
> a second (such as time()) still makes it likely that multiple
> invocations will choose the same first port. Probably, /dev/random as
> first choice, falling back to by gettimeofday() where that's available.
> I don't know what Windows would use. We could probably use time() as a
> last resort, though I'm not crazy about it; maybe it'd be better to fail
> to support port ranges if there's not a decent seed available, and
> support exact port specifications.

Since implementation for 2^n is relatively easy, I think people
usually write the algorithm to up to twice as many numbers as required
and then skip if out of range.

You know, I bet picking randomly and nonrepetetively from a range is a
common enough task that it's in one of the standard libraries.  If
not, it probably should be.

> Thanks for the suggestion, Tony.

If I have a though, I share.  Too much sometimes ;-)  or so my wife tells me.

Tony

Reply via email to