Comment #2 on issue 3129 by [email protected]: Drastic change to
sympy.stats: Adding concept of Probability Distributions on surface level
http://code.google.com/p/sympy/issues/detail?id=3129
The current sugarless way of creating a random symbol is something like this:
BinomA = BinomialPSpace(1, S.Half, symbol=Symbol('X'))
X = RandomSymbol(BinomA, Symbol('X'))
We add the function "Binomial" as syntactic sugar:
X = Binomial(1, S.Half, symbol=Symbol('X'))
or
X = Binomial(1, S.Half)
If a symbol is not specified then a default one is auto-generated.
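A rough sketch of that sugar pattern in plain Python, in case it helps the
discussion. The Toy* names and the "x1, x2, ..." naming scheme are made up for
illustration; this is not the sympy.stats code, only the shape of it.

```python
import itertools

# Hypothetical stand-ins for illustration only; NOT the sympy.stats classes.
class ToyBinomialPSpace:
    """Holds the parameters of a binomial probability space."""
    def __init__(self, n, p, symbol):
        self.n, self.p, self.symbol = n, p, symbol


class ToyRandomSymbol:
    """Pairs a probability space with the symbol that names the variable."""
    def __init__(self, pspace, symbol):
        self.pspace, self.symbol = pspace, symbol


_counter = itertools.count(1)


def toy_binomial(n, p, symbol=None):
    """Sugar: build the space and the random symbol in one call."""
    if symbol is None:
        # Auto-generate a default name when the caller gives none.
        symbol = "x%d" % next(_counter)
    return ToyRandomSymbol(ToyBinomialPSpace(n, p, symbol), symbol)


X = toy_binomial(1, 0.5, symbol="X")  # explicit name
Y = toy_binomial(1, 0.5)              # auto-generated name
print(X.symbol, Y.symbol)
```

The two-step construction stays available underneath; the function only saves
the user from spelling it out.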
It looks like one of the things you're proposing is to replace PSpace with
ProbabilityDistribution and clean it up so that it is user-visible. I'm
generally happy with this. There is a lot we can do to clean up the
internals of sympy.stats and this might be one of them. It would be great
to have a second developer go over the code in depth and provide a second
perspective.
I'm much more hesitant to change the interface, however. I think that
single-line random variable creation is important. It allows introductory
users to jump into sympy.stats much more quickly. I like auto-generation of
symbols if not provided (this makes the code look more like math) but I'm
not going to fight very hard for it.
Some comments on your bullet points
To summarize:
- Add the concept of ProbabilityDistribution to sympy.stats
* I think this should only be done if it replaces ProbabilitySpace
- Distributions are static objects: they carry information like mean,
variance, pdf, and two distributions are equal if they have the same
parameters
* You'll have to be careful about depending on this information. You'll end
up creating lots of compound distributions when doing statistical
manipulations and you won't have the mean, variance, pdf, etc... for these
compound distributions a priori.
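To make the concern about compound distributions concrete, here is a minimal
sketch in plain Python. ToyNormal, ToyCompound and stored_mean are hypothetical
names invented for this example, not anything in sympy.stats. The point is
only that a parameter-defined distribution can carry its mean and variance,
while an object built by combining distributions has nothing stored up front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNormal:
    """Static distribution: fully described by its parameters."""
    mu: float
    sigma: float

    @property
    def mean(self):
        return self.mu

    @property
    def variance(self):
        return self.sigma ** 2


@dataclass(frozen=True)
class ToyCompound:
    """Result of combining distributions, e.g. a ratio of two variables."""
    op: str
    parts: tuple
    # Deliberately no mean/variance: they are not known a priori.


def stored_mean(dist):
    """Return a stored mean if the distribution has one, else None."""
    return getattr(dist, "mean", None)


a = ToyNormal(0, 1)
b = ToyNormal(0, 1)
ratio = ToyCompound("div", (a, b))

print(a == b)              # True: same parameters compare equal
print(stored_mean(a))      # 0
print(stored_mean(ratio))  # None: would have to be computed, if it exists
```

Any code that assumes every distribution exposes these attributes would need a
fallback for the compound case.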
Benefits:
- Get rid of redundancy of creating a class for type of PSpace and then a
function to get the random variable of that PSpace.
* The functions are just there for syntactic sugar. I would suggest this
sugar in either case.
- Explicitly creating symbol names, no more default symbols with increasing
numbers
* This is a different issue I think. This can be addressed in either
system. I.e. we could ask the user to type in "X = Normal(0, 1, 'X')" in
the current system.
- Unambiguous creation of new random variables
* Can you expand upon this?
Drawbacks:
- More verbose to create a new RV
* We could add sugar to solve this.
- May be seen as complicating the already complicated sympy.stats class
hierarchy
* I'm pretty confident that the second go around you would end up reducing
the complexity, not increasing it.
What I would do if this were entirely up to me:
-- Keep the current interface with functions Normal, Binomial, etc....
Require an explicit letter on creation. I.e.
X = Normal(0, 1, 'X')
I believe that either this or the current way, "X = Normal(0, 1)", is the
right way to do random variable creation. It matches mathematical tradition.
-- Release 0.7.2
-- Work on internals
-- Decide later if we want to allow the syntax "X = Normal(0, 1)". It's much
easier to decide later to allow this syntax than disallow it.
This allows us to think about this problem over an extended period without
blocking the release. The API for the internals can be released with 0.7.3.
I.e. we expose only the sugar for the moment to buy us time. I think it
will be easier for us to come up with a plan for the interface than for the
internals.
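The reason allowing "X = Normal(0, 1)" later is the easy direction can be
sketched in a few lines of plain Python. normal_strict and normal_relaxed are
hypothetical stand-ins for two stages of the same function, and returning a
tuple is just a placeholder for a real random variable.

```python
import itertools


def normal_strict(mu, sigma, name):
    """Stage 1: the name is required."""
    return (name, mu, sigma)


_auto = itertools.count(1)


def normal_relaxed(mu, sigma, name=None):
    """Stage 2: the name becomes optional; old call sites are unaffected."""
    if name is None:
        name = "x%d" % next(_auto)
    return (name, mu, sigma)


# Code written against the strict signature runs unchanged on the relaxed one.
assert normal_strict(0, 1, "X") == normal_relaxed(0, 1, "X")

# The reverse does not hold: a nameless call only works after relaxing.
try:
    normal_strict(0, 1)
    strict_accepts_nameless = True
except TypeError:
    strict_accepts_nameless = False

print(strict_accepts_nameless)
print(normal_relaxed(0, 1))
```

Going the other way, from optional to required, would break every existing
nameless call.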
In any event I encourage you to play with this idea. I think that a lot of
good can come out of it.