Comment #2 on issue 3129 by [email protected]: Drastic change to sympy.stats: Adding concept of Probability Distributions on surface level
http://code.google.com/p/sympy/issues/detail?id=3129

The current sugarless way of creating a random symbol is something like this

BinomA = BinomialPSpace(1, S.Half, symbol=Symbol('X'))
X = RandomSymbol(BinomA, Symbol('X'))

We add the function "Binomial" as syntactic sugar:

X = Binomial(1, S.Half, symbol=Symbol('X'))
or
X = Binomial(1, S.Half)

If a symbol is not specified then a default one is auto-generated.
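
To make the relationship between the two forms concrete, here is a minimal plain-Python sketch of the sugar pattern. The classes are simplified stand-ins written only for illustration, not the actual sympy.stats code, and the default naming scheme ("x_1", "x_2", ...) is an assumption.

```python
import itertools


# Hypothetical stand-ins for illustration; not the real sympy.stats classes.
class BinomialPSpace:
    def __init__(self, n, p, symbol):
        self.n, self.p, self.symbol = n, p, symbol


class RandomSymbol:
    def __init__(self, pspace, symbol):
        self.pspace, self.symbol = pspace, symbol

    def __repr__(self):
        return self.symbol


# Source of default names; the "x_<n>" scheme is assumed, not taken from sympy.
_counter = itertools.count(1)


def Binomial(n, p, symbol=None):
    """Sugar: build the probability space and return its random symbol."""
    if symbol is None:
        symbol = "x_%d" % next(_counter)
    return RandomSymbol(BinomialPSpace(n, p, symbol), symbol)


X = Binomial(1, 0.5, symbol="X")  # explicit name
Y = Binomial(1, 0.5)              # name is auto-generated
```

The user never touches the PSpace class directly; whether the name is required or optional is then a one-line decision inside the sugar function.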

It looks like one of the things you're proposing is to replace PSpace with Probability Distribution and clean it up so that it is user-visible. I'm generally happy with this. There is a lot we can do to clean up the internals of sympy.stats and this might be one of them. It would be great to have a second developer go over the code in depth and provide a second perspective.

I'm much more hesitant to affect the interface however. I think that single-line random variable creation is important. It allows introductory users to jump into sympy.stats much more quickly. I like auto-generation of symbols if not provided (this makes the code look more like math) but I'm not going to fight very hard for it.

Some comments on your bullet points

To summarize:
- Add the concept of ProbabilityDistribution to sympy.stats
* I think this should only be done if it replaces ProbabilitySpace
- Distributions are static objects: they carry information like mean, variance, pdf, and two distributions are equal if they have the same parameters
* You'll have to be careful about depending on this information. You'll end up creating lots of compound distributions when doing statistical manipulations and you won't have the mean, variance, pdf, etc... for these compound distributions a priori.
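
A rough plain-Python sketch of that concern follows. The class names are hypothetical, and a normal distribution whose mean is itself random is just one example of a compound distribution.

```python
from dataclasses import dataclass


# Hypothetical classes for illustration only; not sympy.stats code.
@dataclass(frozen=True)
class NormalDistribution:
    mean: float
    std: float

    @property
    def variance(self):
        # Known in closed form from the parameters.
        return self.std ** 2


@dataclass(frozen=True)
class CompoundDistribution:
    # A family whose parameters may themselves be distributions.
    # There is deliberately no mean/variance/pdf attribute here:
    # those would have to be computed, not looked up.
    family: type
    params: tuple


a = NormalDistribution(0, 1)
b = NormalDistribution(0, 1)

# A normal whose mean is drawn from `a`.
c = CompoundDistribution(NormalDistribution, (a, 1))
```

Here `a == b` holds because equality is by parameters, and `a.variance` is available immediately, but any code that assumes every distribution carries a stored mean would break on `c`.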

Benefits:
- Get rid of redundancy of creating a class for type of PSpace and then a function to get the random variable of that PSpace.
* The functions are just there for syntactic sugar. I would suggest this sugar in either case.
- Explicitly creating symbol names, no more default symbols with increasing numbers
* This is a different issue I think. This can be addressed in either system. I.e. we could ask the user to type in "X = Normal(0, 1, 'X')" in the current system.
- Unambiguous creation of new random variables
* Can you expand upon this?

Drawbacks:
- More verbose to create a new RV
* We could add sugar to solve this.
- May be seen as complicating the already complicated sympy.stats class hierarchy
* I'm pretty confident that the second go around you would end up reducing the complexity, not increasing it.


What I would do if this were entirely up to me:

-- Keep the current interface with functions Normal, Binomial, etc.... Require an explicit letter on creation. I.e.
X = Normal(0, 1, 'X')
I believe that either this or the current way, "X = Normal(0, 1)", is the right way to do random variable creation. It matches mathematical tradition.
-- Release 0.72
-- Work on internals
-- Decide later if we want to allow the syntax "X = Normal(0, 1)". It's much easier to decide later to allow this syntax than to disallow it.

This allows us to think about this problem over an extended period without blocking the release. The API for the internals can be released with 0.73. I.e. we expose only the sugar for the moment to buy us time. I think it will be easier for us to come up with a plan for the interface than for the internals.

In any event I encourage you to play with this idea. I think that a lot of good can come out of it.
