Ralf Hemmecke wrote: > > On 10/07/2015 12:26 PM, Waldek Hebisch wrote: > >> If you are satisfied with random integers, then that must be stated > >> in the category documentation. > > > > You mean that we should state explicitely that random elements need > > not exhaust the domain? > > Yes, why not? If (as an implementor) I have such information, I know > that I don't need to bother with exhausting the domain.
I hoped that properties given in my decription was vague enough so that implementer could infer that there is a lot of freedom, in particular that there is no promise to exhaust domain the domain. > >> In fact, I more and more get the impression that you don't actually > >> want "RandomChoice". What you want is a stream of elements of % > >> that do not repeat. Am I wrong? > > > It would be nice if we could get non-repetition for free. > > Well, if we had "Generator(X)" in SPAD, that would make it easier to > implement such non-repeating stream. Maybe one could also use Stream(X) > or even return a function of type ()->X that returns the "next" element. > > > But it seems that user have more info how to handle possible > > repetition. > > Yes, maybe. But a function that returns some element in a somewhat > "uniform" distribution, would also have to take care of repeating elements. I do not understand what you mean: if we generate uniformly distributed and independent elemets then we automatically have posivite probability of repetition. No need to extra introduce them. As I wrote, in case when repeated elements are not wanted the user can remove them. However, some users may tolerate them, so why spent extra effort when this is not needed. Let me say this differently: capability of generating somewhat random elements depends on specific domain. It is not possible to implement it without domain-specific code. OTOH removal of duplicates can be dane in generic way and I do not see how domain could improve it. RandomChoice is intended to cover just the special part, leaving generic stuff to packages. > > OTOH merely having non-repeating sequence is not enough. > > Can you say why? You mentioned gcd and factorization. But I don't > understand. Primality testing would probably a different business. In common case we have a finite bad set S. We need to find some element which does not belong to S. Of course we hope that "random" element is not in S. However, merely replaying some fixed sequence without repetition may lead to bad behavoiur. Namely, if S is large and agrees with initial segment of our sequence, then we may wait quite long for first element not in S. Similaraly, if elements are not random enough we may get a lot of elements from S. > > In practice we enlagre n as geometric sequence, so we need enough > > randomness. > > Waldek, maybe I am the only one here on the list that does not get the > real purpose of your intention from the docstring that you have given. > Can you give more concrete examples of where you would like to apply > randomElement? Look at: https://en.wikipedia.org/wiki/Schwartz-Zippel_lemma and think what you really need to use this lemma. You will see that we really do not care about exact set from which we take the elements. All we need is inequality p(a) <= 1/n for each values that may appear. > >> Well. That somehow sounds like you want to easily throw away "unlucky > >> primes". > > > > That separate topic. > > Really? Well, for sure "deciding unluckyness" must be done by the user, > but otherwise (at least for gcd) it would be enough if randomElement > does not return the same (now known to be unlucky) element over and over > again. Getting the same element is extremally unlikely. Anyway, if needed user can eliminate duplicates in generic way. Also, avoiding unlucky primes may involve some extra effort. For example guessing package in general does not know if given prime is unlucky. But after taking random prime we take also evaluatin point and at the end we may discover that the whole tuple (prime, points) is unlucky. If for given prime we get too many unlucky points we move to next prime. And to make things concrete: guessing packeage can not tolerate duplicates, so it keeps list of primes that was used and checks if value is on this list. AFAICS we can not avoid storing already generated elements, and guessing package needs to store them anyway. So trying to remove duplicates in 'random' would just increase memory use. ATM check in guessing packeage is done via linear search and in principle binary search or hashing could be faster. However current check is good enough. > > I still don't get exactly what the purpose of your function is, but let > me try in documenting it and maybe I would change the signature a bit. > > randomElement: Integer -> Union(() -> %, failed) > ++ randomElement(n) randomly selects n elements from the domain > ++ and returns a function "random" so that random() chooses randomly > ++ among those n preselected elements. If the underlying domain does > ++ not contain n elements then randomElement(n) returns failed. > > That would be something I understand as an implementor. > Implementationwise, that does (of course) not mean, that the call > "randomElement(n)" first computes a *list* of n elements. Such details > are left open. I do not see why you want to return a function? From my point of view this is just useless complication. Also what "randomly selects n elements" means? To have random choice we need a probability distribution and the whole point of RandomChoice is that we do not have any natural distribution. Given that we do not want the intermediate function we would get the following: ++ randomElement(n) first selects n distinct elements from the ++ domain and then randomly chooses one of them with uniform ++ distribution. If the underlying domain does ++ not contain n elements then randomElement(n) returns failed I also wrote that elements should be distinct because otherwise one gets completely unnecessary complications trying to estimate probabilities. However, this splitting into first choosing set and than selectiong element from the set does not look clearer for me than my formulation. Note: having _both_ is certainly clearer than just one. In my formulation reader must infer that logically there is selection of a set. You put stress on selecting set, but than reader must infer that in fact this selection is only at logical level. Your formulation is more precise (provided that we say that n elements are distinct). OTOH I deliberatly did not want to make very precise statement. Namely, I do not care of set is slightly bigger, form probability point of view this is only better. Also it does not matter is probabilities are exatly uniform on choosen set, what matter is that they are low enough. Let me add that I did not make statement that probablity of given element is <= 1/n because in general it hard to say samething exact abot probabilities from pseudo-random generator. If generator is good then there will be no observable difference. But some tiny difference may in principle appear due to combination if imperfect randomness and way we produce elements. However, if you feel that string claim about probability will be correctly understood, then we can make it. BTW: I have now checked one argument 'random'. We have two random : PositiveInteger -> ... but they are in packages, so there will be no conflict. The is one argument 'random' in PermutationGroup, but this one have different argument type. There is conflict with 'random(n)' in integer domains, because our new 'random' returns union, while random in integer returns integers. We could resolve this by specifying return type. I am not sure if we want to do this: there are of order few tens calls to 'random(n)', some will pick correct version because type is already determined. But in say 20 places we would have to use explicit '@Integer'. -- Waldek Hebisch -- You received this message because you are subscribed to the Google Groups "FriCAS - computer algebra system" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/fricas-devel. For more options, visit https://groups.google.com/d/optout.
