Re: Issue 3129 in sympy: Drastic change to sympy.stats: Adding concept of Probability Distributions on surface level

sympy Mon, 05 Mar 2012 16:02:48 -0800

Comment #2 on issue 3129 by [email protected]: Drastic change to sympy.stats: Adding concept of Probability Distributions on surface level

http://code.google.com/p/sympy/issues/detail?id=3129


The current sugarless way of creating a random symbol is something like this

BinomA = BinomialPSpace(1, S.Half, symbol=Symbol('X'))
X = RandomSymbol(BinomA, Symbol('X'))


We add the function "Binomial" as syntactic sugar

X = Binomial(1, S.Half, symbol=Symbol('X'))
or
X = Binomial(1, S.Half)

If a symbol is not specified then a default one is auto-generated.
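
To make the sugar pattern concrete, here is a toy sketch in plain Python. It does not use sympy at all: ToyBinomialPSpace, ToyRandomSymbol and toy_binomial are hypothetical stand-ins for BinomialPSpace, RandomSymbol and Binomial, and the "x0, x1, ..." naming of default symbols is only an assumption about how auto-generation might look.

```python
import itertools
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional

# Hypothetical stand-ins; none of these names come from sympy itself.
_default_names = itertools.count()


@dataclass(frozen=True)
class ToyBinomialPSpace:
    n: int
    p: Fraction
    symbol: str


@dataclass(frozen=True)
class ToyRandomSymbol:
    pspace: ToyBinomialPSpace
    symbol: str


def toy_binomial(n: int, p: Fraction, symbol: Optional[str] = None) -> ToyRandomSymbol:
    """One-line sugar: build the space, then the random symbol living on it."""
    if symbol is None:
        # Assumed scheme: default symbols with increasing numbers.
        symbol = f"x{next(_default_names)}"
    space = ToyBinomialPSpace(n, p, symbol)
    return ToyRandomSymbol(space, symbol)


X = toy_binomial(1, Fraction(1, 2), symbol="X")  # explicit symbol
Y = toy_binomial(1, Fraction(1, 2))              # auto-generated symbol
```

The two-step construction is still there; the function only hides it behind a single call.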

It looks like one of the things you're proposing is to replace PSpace with Probability Distribution and clean it up so that it is user-visible. I'm generally happy with this. There is a lot we can do to clean up the internals of sympy.stats and this might be one of them. It would be great to have a second developer go over the code in depth and provide a second perspective.

I'm much more hesitant to affect the interface, however. I think that single-line random variable creation is important. It allows introductory users to jump into sympy.stats much more quickly. I like auto-generation of symbols if not provided (this makes the code look more like math) but I'm not going to fight very hard for it.


Some comments on your bullet points

To summarize:
- Add the concept of ProbabilityDistribution to sympy.stats
* I think this should only be done if it replaces ProbabilitySpace

- Distributions are static objects: they carry information like mean, variance, pdf, and two distributions are equal if they have the same parameters
* You'll have to be careful about depending on this information. You'll end up creating lots of compound distributions when doing statistical manipulations and you won't have the mean, variance, pdf, etc. for these compound distributions a priori.
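
A toy sketch of this concern, again in plain Python rather than sympy: ToyBernoulli is a hypothetical "static" distribution with a closed-form mean and parameter-based equality, while the sum of two independent variables serves as a simple stand-in for a compound distribution whose mean is not stored anywhere and has to be computed.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ToyBernoulli:
    """Hypothetical static distribution: equality is by parameters."""
    p: Fraction

    @property
    def mean(self) -> Fraction:
        return self.p  # closed form, known up front

    def pmf(self) -> dict:
        return {0: 1 - self.p, 1: self.p}


def pmf_of_sum(a: ToyBernoulli, b: ToyBernoulli) -> dict:
    """pmf of A + B for independent A and B; no closed-form mean is attached."""
    out = {}
    for x, px in a.pmf().items():
        for y, py in b.pmf().items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out


def mean_from_pmf(pmf: dict) -> Fraction:
    """Fallback when the mean is not known a priori: compute it from the pmf."""
    return sum(value * prob for value, prob in pmf.items())


same = ToyBernoulli(Fraction(1, 2)) == ToyBernoulli(Fraction(1, 2))
total = pmf_of_sum(ToyBernoulli(Fraction(1, 2)), ToyBernoulli(Fraction(1, 4)))
total_mean = mean_from_pmf(total)
```

Anything that relies on a stored mean or variance works for ToyBernoulli but needs a computed fallback for the derived object.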


Benefits:

- Get rid of redundancy of creating a class for type of PSpace and then a function to get the random variable of that PSpace.
* The functions are just there for syntactic sugar. I would suggest this sugar in either case.

- Explicitly creating symbol names, no more default symbols with increasing numbers
* This is a different issue I think. This can be addressed in either system. I.e. we could ask the user to type in "X = Normal(0, 1, 'X')" in the current system.

- Unambiguous creation of new random variables
* Can you expand upon this?

Drawbacks:
- More verbose to create a new RV
* We could add sugar to solve this.

- May be seen as complicating the already complicated sympy.stats class hierarchy
* I'm pretty confident that the second go around you would end up reducing the complexity, not increasing it.



What I would do if this were entirely up to me:

-- Keep the current interface with functions Normal, Binomial, etc. Require an explicit letter on creation. I.e.

X = Normal(0, 1, 'X')

I believe that either this or the current way, "X = Normal(0, 1)", is the right way to do random variable creation. It matches mathematical tradition.

-- Release 0.72
-- Work on internals

-- Decide later if we want to allow the syntax "X = Normal(0, 1)". It's much easier to decide later to allow this syntax than to disallow it.
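
The compatibility argument in the last point can be sketched with two toy functions in plain Python. normal_strict and normal_relaxed are hypothetical names, not sympy functions, and the default symbol "x0" is an assumption.

```python
def normal_strict(mean, std, symbol):
    """Hypothetical first release: the symbol is required."""
    return (symbol, mean, std)


def normal_relaxed(mean, std, symbol=None):
    """Hypothetical later release: the symbol becomes optional."""
    if symbol is None:
        symbol = "x0"  # assumed default name
    return (symbol, mean, std)


# Every call written against the strict version still works after relaxing.
old_call = (0, 1, "X")
still_works = normal_strict(*old_call) == normal_relaxed(*old_call)

# Going the other way breaks code that relied on the optional symbol.
try:
    normal_strict(0, 1)
    tightening_breaks_callers = False
except TypeError:
    tightening_breaks_callers = True
```

Relaxing a required argument keeps existing calls valid, while tightening an optional one does not.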

This allows us to think about this problem over an extended period without blocking the release. The API for the internals can be released with 0.73. I.e. we expose only the sugar for the moment to buy us time. I think it will be easier for us to come up with a plan for the interface than for the internals.

In any event I encourage you to play with this idea. I think that a lot of good can come out of it.


