On 23/10/2020 11:59, Oscar Benjamin wrote:
Hi Jason,

I'm approaching this from the perspective that this is a bug in
cse/lambdify and that you want to identify which part of a large
expression tree triggers the bug. Since evaluating the whole
expression is slow I would start by testing smaller subexpressions and
work up from there. You can get something like that with:

from sympy.utilities.iterables import postorder_traversal

def cse_compare(expr, syms, vals):
     result = lambdify(syms, expr)(*vals)

     syms_all, vals_all = list(syms), list(vals)
     reps, [reduced_expr] = cse(expr)
     for csesym, cseexpr in reps:
         val = lambdify(syms_all, cseexpr)(*vals_all)
         syms_all.append(csesym)
         vals_all.append(val)
     result_cse = lambdify(syms_all, reduced_expr)(*vals_all)

     return abs(result - result_cse) / abs(result)

def cse_compare_bottom_up(expr):
     syms = list(expr.free_symbols)
     vals = list(np.random.random(len(syms)))

     def subexprs():
         seen = set()
         for e in postorder_traversal(expr):
             if e not in seen:
                 yield e
                 seen.add(e)

     # possibly better:
     #exprs = sorted(set(postorder_traversal(expr)), key=lambda e: 
e.count_ops())
     exprs = subexprs()

     for subexpr in exprs:
         print()
         print(subexpr)
         print(cse_compare(expr, syms, vals))

test_expr = (sin(x+1) + 2*cos(x-3)).expand(trig=True)
cse_compare_bottom_up(test_expr)


Oscar

On Fri, 23 Oct 2020 at 07:59, Jason Moore <[email protected]> wrote:
Oscar,

Yes I can try evaluating in different orders. That's an interesting idea.

I also may be breaking lambdify with too many arguments. I can store the 
numerical results of each subexpression in a dictionary, then identify the 
minimal set of variables in each subsequent subexpression, and only lambdify 
with that very much smaller set of arguments. Maybe that will fix it.

I'll try both.

Jason
moorepants.info
+01 530-601-9791


On Fri, Oct 23, 2020 at 8:56 AM Oscar Gustafsson <[email protected]> 
wrote:
Yes, you are right that the email is about a different topic. However, it can be worth 
noting that evaluating floating-point expressions in different order, which is basically 
what you do using cse:s, can give different results due to the floating-point 
quantization/rounding happening at different places. Hence, it is not always obvious to 
tell if the results are "identical". If you are unlucky, the error propagation 
can be quite disastrous at times, but you are probably correct in that there is something 
else that is the primary issue here.

BR Oscar

Den tors 22 okt. 2020 kl 18:57 skrev Jason Moore <[email protected]>:
As far as I understand the results should match to machine precision.

We are able to match these kinds of models executed on different platforms 
using different formulation techniques to machine precision (e-14 or more). In 
fact, we use this to be able to verify that two independent modelers have 
created the same model.

The fact that my model can't get the same results with two numerical 
evaluations of sympy code is concerning.

My question in this email though, is how to improve the comparison code so that 
it doesn't crash for large expressions.

Jason
moorepants.info
+01 530-601-9791


On Thu, Oct 22, 2020 at 6:49 PM Oscar Gustafsson <[email protected]> 
wrote:
How much do they differ? I checked the issue and the latest result wasn't that 
bad, although probably too much to blame floating-point errors. However, if you 
push the numerical range into e.g. denormalized numbers I guess it can simply 
be that. Not very likely though. Is it still that type of numerical errors?

BR Oscar

Den tors 22 okt. 2020 09:37Jason Moore <[email protected]> skrev:
Howdy,

I'm trying to track down a bug in this PR:

https://github.com/pydy/pydy/pull/122

I have quite large SymPy expressions which I evaluate with floating point 
numbers. I also cse those expressions and evaluate those with the same floating 
point numbers. I'm getting discrepancies in the results. I'd like to eliminate 
the possibility that the cse'd expressions are different from the original 
expression. I wrote this naive function to make the comparison:

def compare_cse(expr):

     args = list(expr.free_symbols)
     args.remove(TIME)
     args += me.find_dynamicsymbols(expr)
     args = list(sm.ordered(args))
     vals = list(np.random.random(len(args)))

     eval_expr = sm.lambdify(args, expr)
     res = eval_expr(*vals)

     replacements, reduced_expr = sm.cse(expr)
     for repl in replacements:
         var = repl[0]
         sub_expr = repl[1]
         eval_sub_expr = sm.lambdify(args, sub_expr)
         vals.append(eval_sub_expr(*vals))
         args.append(var)
     eval_reduced_expr = sm.lambdify(args, reduced_expr[0])
     cse_res = eval_reduced_expr(*vals)

     np.testing.assert_allclose(res, cse_res)

This seems to work for small expressions, but for the large expressions run 
this actually kills my Python interpreter after some minutes.

Is there a better way to write this? I need some way to verify that numerical 
evaluation of a cse'd expression gives the same result as numerical evaluation 
of the expression that can actually complete in a reasonable amount of time.

Jason
moorepants.info
+01 530-601-9791

--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAP7f1Aivs53AcwyaoyO5%2Bzkn%2BPrX3TfoJJ5X85qQNZ6JRbo5JQ%40mail.gmail.com.
--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAFjzj-LQNBFoF7Q4Rb%2BZG3J35-Et9b__SED-DMJvwUYqyKYADQ%40mail.gmail.com.
--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAP7f1AgdkaC0D69s_ohka_AjDFvKCD%2BMoVKyFhs0Vn2b_DtFkw%40mail.gmail.com.
--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAFjzj-LFUSgmQ90Uok2LxKSTZ-bJKysCy%2BGqvs1wYtDErUAZNA%40mail.gmail.com.
--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAP7f1Ai1UbiR6vLZs7_PTmY2WOk-VpsqYTSsTm4KUi6e9Y78qQ%40mail.gmail.com.

I am a bit puzzled that anyone would perform algebraic manipulations on an expression involving floating point numbers. Surely this is asking for trouble, I mean the sequence a+b+c need not equal a+c+b if a,b,c are floating point numbers. I would have thought it would always be better to replace the floating point numbers by variables, perform the operation, and then substitute the values back into the expression?

This can easily happen without creating denormals, etc.

David

--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/c7ad75e0-b24d-622b-5c3c-bc09033b9b6a%40dbailey.co.uk.

Reply via email to