Thank you Aaron for taking the time to reply to my email. I appreciate your input.
I wrote a draft of my GSOC proposal <https://drive.google.com/file/d/1CmHKtQ3DZYe-PH878o2bijJAtY8j1ahe/view>. Could you (or anyone else) give me feedback? Where is my understanding wrong and which parts should I expand upon? Particularly for the timeline section I'm not sure what to put. I still need to do a lot of research and I'm not sure how long things will take. Also, a question: why does SymPy implement its own SAT solver instead of using a SAT/SMT solver such as Z3 <https://en.wikipedia.org/wiki/Z3_Theorem_Prover>. Although Z3 is associated with Microsoft, it is licensed under the MIT License and has a python wrapper. Wouldn't using something like Z3 which is super optimized be a lot faster than writing our own SAT solver? On Wednesday, March 29, 2023 at 12:26:47 AM UTC-7 [email protected] wrote: > On Wed, Mar 29, 2023 at 12:21 AM Tilo RC <[email protected]> wrote: > >> Thank you for your response Oscar. I think I’m interested in taking a >> stab at improving the capabilities of the new assumption system to deal >> with inequalities (and perhaps relations between symbols in general?). I’ve >> spent a lot of time reading through the documentation, code, discussions, >> and previous proposals related to assumptions and come to some >> understanding. I’ve tried to explain what I understand so far below. Could >> you read through what I’ve written and correct any mistakes in my thinking? >> >> For around a decade the SymPy community has wanted to transition from the >> “old assumptions” to the “new assumptions.” The old assumptions are seen as >> fundamentally limited compared to the new assumptions for reasons I’m >> trying to figure out. This is my understanding: >> >> - >> >> The old assumption system is deeply connected to the core of sympy. I >> think this is bad because it’s responsible for some of the other problems >> with it and violates OOP principles. >> >> > It's not bad because of some abstract reasons. The issue is that the old > assumptions design limits what it is able to do. Assumptions can only be > made on symbols, deductions can only be done via direct chains of inference > from handlers, and it's impossible to add additional facts to the system > except by adding handlers to the specific relevant classes. > > Just as an example, inequality assumptions like x > 1 are simply > impossible to even represent in the old assumptions. > > >> >> - >> >> One fundamental flaw of the old assumption system is that it’s not >> possible to make assumptions about relations between different variables. >> The new assumption system has this capability, and even though it is not >> very feature rich, it has a lot of potential. >> >> > This is correct. > > >> >> - >> >> The old assumption system slows down SymPy. I’m not sure how true >> this still is. It seems like the core and the old assumption system have >> gotten many improvements to make them faster. For example, because I >> accidentally read the SymPy 1.11 >> >> <https://docs.sympy.org/latest/guides/assumptions.html#mechanism-of-the-assumptions-system> >> >> instead of the SymPy 1.13 >> >> <https://docs.sympy.org/dev/guides/assumptions.html#mechanism-of-the-assumptions-system> >> >> documentation I learned about the ManagedProperties metaclass and then >> saw >> it was removed to improve efficiency. (By the way, both of these sections >> from the documentation say that “This explanation is written as of SymPy >> 1.7.” even though the writing clearly changed between 1.11 and 1.13). >> >> > This is still true. The issue isn't just the speed of the old assumptions, > but the fact that object constructors make heavy use of them. Take the > expression in https://github.com/sympy/sympy/issues/24565, which takes 30 > seconds just to construct. This time is all spent in assumptions. Ideally, > simply constructing an expression shouldn't call any assumptions. One of > the reasons the new assumptions are separate from the core is that there > was a hope that this would make it easier to remove assumptions from core > constructors (but really that isn't necessary; we just need to be better > about not using assumptions in constructors). > > >> >> As for improving how the new assumptions deal with inequalities, there >> are some very basic features they’re missing. For example, >> ask(Q.negative(x),Q.ge(x,1)) returns None under the current system even >> though it seems like it would be really simple to implement something where >> if x > 0 then x is not negative. Similarly, ask(Q.gt(x,0),Q.gt(x,1)) >> returns None. A more complicated query that returns None is ask(Q.eq(x,1), >> Q.integer(x) & Q.gt(x,0) & Q.lt(x,2)). >> >> I also notice that even equality is somewhat limited in the new >> assumption system. For example, ask(Q.positive(y), Q.eq(x,y) & >> Q.positive(x)) returns None. I’m a bit unsure why this hasn’t been >> implemented. One guess I have is that implementing this functionality would >> be too slow because maybe you would have to use the solver? >> > > No, it just hasn't been implemented. The idea here is that we need a way > to rewrite these predicates and/or generate new predicates that will allow > the assumptions system SAT solver to deduce things about the symbols. It > needs to be done efficiently. There is likely literature on ways to do > this. For instance, if we have Eq(x, y) then we could add a predicate > Equivalent(p(x), p(y)) for every predicate p(x) and p(y), but there could > be a lot of such predicates. There's likely smarter ways to represent this > substitution system. I'm sure there's also good heuristics too (like just > substituting y for x everywhere). > > In general, we may need to move from a SAT solver to a SMT solver approach > to handle some of these things completely efficiently. > > >> Frankly, although I’ve spent a lot of time reading the files in the new >> assumption system, I only have a surface level understanding of them: >> >> >> - >> >> relation/equality.py >> - >> >> I probably understand this file the best out of all the files in >> the new assumptions. It contains implementations of binary relation >> for >, >> ≥, ≤, <, =, and ≠. >> - >> >> Ask.py >> - >> >> Heart of the new assumptions. Defines ask() and Q. In general, ask >> tries to find computationally cheap ways to evaluate propositions. If >> it >> can’t do it in a cheap way it calls the SAT solver. Also, all of the >> assumptions are converted into a normal form which I think is >> analogous to >> the role Chomsky normal form plays in allowing a parser to parse a >> grammar >> efficiently. >> >> >> How difficult would it be to implement something that deals with >> inequalities better? What about the specific cases I outline above? Would >> this be enough for a summer project? Do you think this task to advanced for >> someone of my background? >> > > You'll need to do more research, but if you are motivated it's not > impossible to do. You should first get a better understanding of the basics > of SAT and concepts like CNF and satisfiability. Understanding the inner > workings of a SAT solver is also useful but not as important, at least for > a first pass. For the most part, you can treat a SAT solver as a black box. > > It also would be a good idea to look at the literature to see if you can > find if anyone has written any papers on how to do this. This sort of thing > is already doable by SMT solvers so it's not completely novel, but there > may be simpler approaches when dealing with pure inequalities. > > The files you want to get an understanding of are satask and sathandlers. > In particular, if you can grok the basic way that satask works in how it > gets answers from the SAT solver, that is a good first step. > > Aaron Meurer > > On Tuesday, March 21, 2023 at 1:23:12 PM UTC-7 Oscar wrote: >> >>> On Tue, 21 Mar 2023 at 09:02, Tilo RC <[email protected]> wrote: >>> > >>> > Hello! My name is Tilo and I am a sophomore at Pomona College in >>> Claremont, California. Currently, I am double majoring in math and cs. I am >>> highly interested in participating in GSoC with sympy because I believe it >>> would be a great opportunity to enhance my programming skills and give back >>> to the open-source community. >>> >>> Hi Tilo, >>> >>> > I am uncertain about which GSoC Ideas would suit me best, given my >>> background. However, I have some tentative ideas: >>> > >>> > Parsing >>> > >>> > Currently, I am enrolled in an introductory course on languages and >>> the theory of computation, where we have recently started exploring >>> parsers. Additionally, I am taking a natural language processing class that >>> involves programming intensive assignments on probabilistic context-free >>> grammars and sentence parsing. However, I have limited experience in some >>> of the potentially required languages such as Fortran, C, C++, Julia, Rust, >>> LLVM, Octave, and Matlab. I think a project that focuses on existing LaTeX >>> functionality would be a good fit. >>> >>> I think that what really needs to happen with parsing is to rewrite >>> the parsers based on something like lark. The current parsing code >>> either uses ad-hoc parsers or in the case of parse_expr uses eval. I >>> think that making it so that parse_expr does not use eval should be a >>> top priority in parsing. >>> >>> > Assumptions >>> > >>> > This idea seems very challenging but it also interests me a lot. >>> However, a bug I found today made me want to learn more about how sympy’s >>> assumption system works. It’s not exactly clear to me what the prereqs for >>> working on this idea are. However, I have experience with number theory >>> which is listed as one of the prereqs for some reason. Also, I’ve taken a >>> class on functional programing with coq which seems like it could possibly >>> be relevant. >>> >>> Assumptions are a big topic. Making it so that the new assumptions >>> system can understand inequalities meaningfully would make a good >>> project. That is a very common feature request. I don't think number >>> theory is relevant. >>> >>> > Improve the plotting module >>> > >>> > If the other ideas seem unrealistic or impractical, this project seems >>> well-suited to my capabilities. I have experience with HTML, Javascript, >>> and CSS. (Actually, JavaScript was the first language I learned. I even >>> taught a lesson on using JS to approximate integrals in my high school >>> calculus class. Hey! Python would’ve been better but JavaScript worked!). >>> While working on an issue related to polygons I noticed that there was no >>> convenient way to plot polygons and other geometric objects, so maybe some >>> of the work on this idea could add functionality related to that. >>> >>> Significant improvements have been made to the plotting module through >>> sympy_plot_backends: >>> https://sympy-plot-backends.readthedocs.io/en/latest/ >>> >>> Integrating that improved plotting module back into SymPy would make a >>> great project. There are some (I think small) difficulties in doing >>> this in a way that is backwards compatible so it needs a bit of work: >>> https://github.com/sympy/sympy/issues/23036 >>> >>> -- >>> Oscar >>> >>> >>> > >>> > I would appreciate your opinion on which of these ideas to explore >>> further, and whether there are any better-suited to my background. >>> > >>> > -- >>> > You received this message because you are subscribed to the Google >>> Groups "sympy" group. >>> > To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> > To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sympy/c4b9de3d-40e8-40e6-8546-c8fff11db9ban%40googlegroups.com. >>> >>> >>> >> -- >> You received this message because you are subscribed to the Google Groups >> "sympy" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> > To view this discussion on the web visit >> https://groups.google.com/d/msgid/sympy/96242267-2259-400e-a529-42c20994e2c9n%40googlegroups.com >> >> <https://groups.google.com/d/msgid/sympy/96242267-2259-400e-a529-42c20994e2c9n%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "sympy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/18580097-c93b-4b09-b5ec-820017bbe603n%40googlegroups.com.
