Re: [Boston.pm] Permutation with Replacement considered harmful RE: Combinatorics

Federico Lucifredi Wed, 23 Nov 2005 08:58:08 -0800

Hello Bill!

I did not know you were a mathematician =)


 On the terminology point you mentioned, I must place the blame for the
choice of words squarely on the way Discrete Math is taught these days.
Standard texts refer to Permutation, Combination, Permutation with
Repetition, and Combination with Repetition. I have always seen it
framed this way, but I readily agree with your point: when I noticed
that MathWorld names things differently, its naming seemed to make
more sense.

 If you have any hope of rectifying the issue, I recommend you look at
the relevant Wikipedia page and submit corrections.


> The proper name for this in classic probability was "Ordered Sampling
> from an Urn with Replacement"; in modern linguistic combinatorics, it's
> more simply a "String", as you've found.  

Which is a much better name for it. Anyhow, on to the actual issue at
hand: the module. In my explorations of it, I have found that it
leaves a bit to be desired in terms of performance -- generating the 65k
strings possible in the (a b c d) alphabet takes 10 minutes, and
generating the 1'000'000 strings possible in (a b c d e) takes 9 hours.
Obviously, something is going on.

I do not want to add frequency information, since every symbol is equally
likely for me, and I would like to code this up in a way that (a) is
thread friendly and (b), more importantly, is memory efficient, which
the module is not.
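
A minimal sketch of one memory-frugal approach, assuming equally likely
symbols and no weighting. It is not the module's code; the names
make_string_iterator and string_at are hypothetical. The iterator keeps
only one counter per position (an "odometer") and hands out one string
per call, so memory use does not grow with the number of strings.

```perl
use strict;
use warnings;

# Return a closure that yields every length-$n string over @alphabet
# exactly once, then undef. Only the $n position counters are stored.
sub make_string_iterator {
    my ($n, @alphabet) = @_;
    my @idx  = (0) x $n;                 # one odometer wheel per position
    my $done = ($n > 0 && !@alphabet);   # no symbols: nothing to yield

    return sub {
        return if $done;
        my $s = join '', @alphabet[@idx];

        # Advance: bump the rightmost wheel, carrying to the left.
        my $pos = $n - 1;
        while ($pos >= 0 && ++$idx[$pos] == @alphabet) {
            $idx[$pos--] = 0;
        }
        $done = 1 if $pos < 0;           # carried past the leftmost wheel
        return $s;
    };
}

# Compute the $k-th string (0-based) directly, by writing $k in base
# scalar(@alphabet). This lets separate workers each take a range of $k.
sub string_at {
    my ($k, $n, @alphabet) = @_;
    my $base = @alphabet;
    my @chars;
    for (1 .. $n) {
        unshift @chars, $alphabet[$k % $base];
        $k = int($k / $base);
    }
    return join '', @chars;
}

my $next = make_string_iterator(2, qw(a b c));
while (defined(my $s = $next->())) {
    print "$s\n";
}
```

Because string_at needs no shared state, it is one possible route to the
thread-friendly requirement: each thread can be given its own slice of
the index range 0 .. (number of symbols)**$n - 1.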

> <HUMOR>
> If you don't need the weighting (frequency) feature, you could just use
> 
>   my $alphabet='aeiou';
>   my $n=2;
>   my @strings=grep { /[$alphabet]{$n}/ } 'a'x$n .. 'z'x$n;
>   print "@strings";
> 
> but it's not very efficient for large values of n unless the alphabet is
> dense in a..z, nor for alphabets that aren't subsets of ASCII. Boosting
> $n=6 will run out of memory.  Switching to a for loop on the .. (which
> is optimized) gets you partial results quickly
>       my $ab="aeiou"; 
>       my $n=6; 
>       for (q{a}x$n .. q{x}x$n) {
>               next unless /[$ab]{$n}/;
>               print;
>       }
> but it takes quite a while to scan from auuuuu to eaaaaa, even on a
> gigahertz clock -- total elapsed time 5.5 minutes!  (If you only need to
> do it once, that's quicker than downloading a module .. but otherwise
> ..) So I guess having a module to build these makes good sense ... at
> least until Perl6 lets us run our regexes and parser rules backwards to
> generate strings.
> 
> </HUMOR>

I need to figure out a way to do it w/o using great amounts of RAM,
sadly. Well, *need* is a strong word, but I would like to =)


> Have fun in Combinatorics land,

eheh - Actually, Combinatorics was the part of Discrete Math I liked the
least, but I have to say that calling them Permutations, Combinations
and Strings makes it more likable (the "with repetition" definition
never made sense on an instinctive level, although it is a correct
definition nonetheless)... by the way, what would a "Combination with
Repetition" be called? Hmm - sounds like I have to review the
choose-pick notation.
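
For reference, a summary of the standard counts for the four cases when
picking k items from n symbols. The symbols n and k are introduced here,
and "multiset" is the usual textbook name for a combination with
repetition:

```latex
\[
\begin{aligned}
\text{Permutations (ordered, no repetition):}   &\quad \frac{n!}{(n-k)!} \\
\text{Combinations (unordered, no repetition):} &\quad \binom{n}{k} \\
\text{Strings (ordered, with repetition):}      &\quad n^{k} \\
\text{Multisets (unordered, with repetition):}  &\quad \binom{n+k-1}{k}
\end{aligned}
\]
```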

 -f

 
_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
