Re: [fricas-devel] Random choice

Ralf Hemmecke Thu, 08 Oct 2015 02:13:48 -0700

Hi Waldek,

Before I go on. Excuse me, when I didn't understand your proposal in the
first place. I hope it isn't just a waste of time for you to make it
clear to me.


> I hoped that properties given in my decription was vague enough
> so that implementer could infer that there is a lot of freedom,
> in particular that there is no promise to exhaust domain the domain.

After our discussion I got a bit closer.

>>>           ++ random_element(n) generates random element
>>>           ++ with values in some set having of order \spad{n}
>>                                       ^^^^^^^^^^^^^^^^^^^^^^^^
>> That is something I don't understand.
>>
>> If I had to implement that category for SparseUnivariatePolynomial(R)
>> what exactly should I implement then? "order n" is not a well defined
>> concept. I would understand randomElement(n) chooses from a pool of n
>> elements. The question then only is: what is the pool?
>>
>>>           ++ elements and with probability distribution that is
>>>           ++ roughly uniform on this set.  Returns failed if
>>>           ++ \spad{%} has less than \spad{n} elements.

I was criticising the phrase "having of order n elements", because it
didn't sounded correct English to my (non-native-english) ears. I would
have better understood, if it had been "having at least n elements".

>> Yes, maybe. But a function that returns some element in a somewhat
>> "uniform" distribution, would also have to take care of repeating elements.
> 
> I do not understand what you mean: if we generate uniformly
> distributed and independent elemets then we automatically
> have posivite probability of repetition.  No need to extra
> introduce them.

What I meant was that in order to provide "uniform distribution" one
must make sure that one element isn't repeated indefinitely. That didn't
mean that randomElement would have to remember which element it already
delivered. Even if "randomElement(n)" just selects n elemenst and than
uses "random(n)" as an index into this set, it would "take care of
repeating elements".

> Let me say this differently: capability of generating somewhat
> random elements depends on specific domain.

Nonetheless, if a domain would have a way to stream its elements, that
could probably be used for an implementation of the function you want.
But sure, it looks like overshooting for your "random" case.
Again, I didn't get what you were up to.

>>> OTOH merely having non-repeating sequence is not enough.
>>
>> Can you say why? You mentioned gcd and factorization. But I don't
>> understand. Primality testing would probably a different business.
>  
> In common case we have a finite bad set S.  We need to find
> some element which does not belong to S.  Of course we hope
> that "random" element is not in S.  However, merely replaying
> some fixed sequence without repetition may lead to bad behavoiur.
> Namely, if S is large and agrees with initial segment of our
> sequence, then we may wait quite long for first element
> not in S.  Similaraly, if elements are not random enough
> we may get a lot of elements from S.

Still, except for a bad running time a stream of non-repeating elements
*is* enought. And nobody forbids you to pass the stream on so that
already consumed elements are not considered again. Furthermore, one
could even require that the function does not deliver the same sequence.
Anyway, delivering random non-repeating sequences is perhaps even more
overshooting.

> Look at:
> 
> https://en.wikipedia.org/wiki/Schwartz-Zippel_lemma
> 
> and think what you really need to use this lemma.  You will
> see that we really do not care about exact set from which
> we take the elements.  All we need is inequality p(a) <= 1/n
> for each values that may appear.

Had you mentioned primality testing in the first place, I would have
better understood what you were up to.

>> I still don't get exactly what the purpose of your function is, but let
>> me try in documenting it and maybe I would change the signature a bit.
>>
>> randomElement: Integer -> Union(() -> %, failed)
>>   ++ randomElement(n) randomly selects n elements from the domain
>>   ++ and returns a function "random" so that random() chooses randomly
>>   ++ among those n preselected elements. If the underlying domain does
>>   ++ not contain n elements then randomElement(n) returns failed.
>>
>> That would be something I understand as an implementor.
>> Implementationwise, that does (of course) not mean, that the call
>> "randomElement(n)" first computes a *list* of n elements. Such details
>> are left open.
> 
> I do not see why you want to return a function?  From my point
> of view this is just useless complication.  Also what
> "randomly selects n elements" means?  To have random choice
> we need a probability distribution and the whole point of
> RandomChoice is that we do not have any natural distribution.
> 
> Given that we do not want the intermediate function we would get the
> following:
> 
> ++ randomElement(n) first selects n distinct elements from the
> ++ domain and then randomly chooses one of them with uniform
> ++ distribution. If the underlying domain does
> ++ not contain n elements then randomElement(n) returns failed
> 
> I also wrote that elements should be distinct because otherwise
> one gets completely unnecessary complications trying to
> estimate probabilities.  However, this splitting into first
> choosing set and than selectiong element from the set does
> not look clearer for me than my formulation.  Note: having
> _both_ is certainly clearer than just one.  In my formulation
> reader must infer that logically there is selection of
> a set.  You put stress on selecting set, but than reader must
> infer that in fact this selection is only at logical level. 

> Your formulation is more precise (provided that we say that
> n elements are distinct). 

When I wrote "randomly selects n elements" then it means that those
elements are distinct. Otherwise it wouldn't be n elements. But yes,
adding "distinct" makes it clearer. One could probably even say "selects
at least n distinct elements", but I guess most implementors will anyway
just select exactly n elements.

Concerning my introduction of returning a random() function instead of
the elements themselves...

You mention that my split in preselecting elements and then randomly
choosing from that set does not look clearer. I still not only find that
clearer (especially when we agree on "at least n distinct elements"),
because

1) If the domain does not have n elements, one gets failed immediately
and the place in the probabilistic algorithm where this appears can
probably better handled if the "failed" appears just at the place where
the probabilistic algorithm expect an element.

2) The random() function returns an element from the domain and not
something of type Union(%, failed). So conversion would be avoided. I
expect that random() would be more often called than the function
randomElement(n) that returns the random function.

3) If one uses randomElement: Integer -> Union(%, failed), in each call,
the implementor has to make sure that an element is only chosen with
probability <= 1/n, but for complex domains it is probably hard to
ensure such a probability and it might be easier to have a way to store
some information in local variables as it would be possible with a
signature like Integer -> Union(()->%, failed).

4) With your signature... as an implementor I don't know whether the
callee just calls randomElement(4), randomElement(5), randomElement(6),
etc. would the implementor be allowed to return the same element for all
those calls? Exactly this case is confusing, if you insist on Integer ->
Union(%, failed) and was the reason why I introduced another indirection
level.

> BTW: I have now checked one argument 'random'.  We have two
> 
> random : PositiveInteger -> ...

Let's discuss this when it really becomes relevant, i.e. if RandomChoice
uses random instead of randomElement or randomSelect.

Maybe instead of randomElement I would rather use

  choose: PositiveInteger -> Union(()->%, failed)

My first feeling is that I wouldn't want to overload on the return type.

Ralf

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/fricas-devel.
For more options, visit https://groups.google.com/d/optout.

Re: [fricas-devel] Random choice

Reply via email to