On Sat, Nov 21, 2020 at 10:41:31PM +0800, Qian Yun wrote:
> OK, here are some of my preliminary findings about this deep
> learning system (I'll refer to it as DL):
>
> (The tests are done with a beam size of 10, which means this DL
> will give the 10 answers that it deems most likely.)
> (And I use the official FWD + BWD + IBP trained model.)
>
> 1. It doesn't handle large numbers very well.
The paper said that for training they used numbers up to 5 in random
expressions.  Differentiation and arithmetic simplification may produce
larger numbers, but clearly large numbers go beyond the training set.
Also, in the past there were works suggesting that arithmetic is hard
for ANN-s.  OTOH we do not need DL for arithmetic and IMO use of DL
for integration is mostly independent of arithmetic.

> For example, to integrate "x**1678", its answers are
>
> -0.05505  NO  x**169/169
> -0.08730  NO  x**1685/1685
> -0.10008  NO  x**1681/1681
> -0.21394  NO  x**1689/1689
> -0.25264  NO  x**1687/1687
> -0.25288  NO  x**1688/1681
> -0.28164  NO  x**1678/1678
> -0.28320  NO  x**1678/1679
> -0.29745  NO  x**1684/1684
> -0.31267  NO  x**1678/1685
>
> This example tests DL's understanding of the pattern
> "the integral of x^n is x^(n+1)/(n+1)".
>
> This result seems to show that DL understands the pattern but fails
> to compute "n+1" even for not so large n.
>
> 2. DL may give a correct result that contains strange constants.
>
> For example, to integrate "x**2", its answers are
>
> -0.25162  OK  x**3*(1/cos(2) + 1)/(6*(1/(2*cos(2)) + 1/2))
> -0.25220  OK  x**3*(1 + 1/cos(1))/(6*(1/2 + 1/(2*cos(1))))
> -0.25304  OK  x**3*(1 + 1/sin(2))/(6*(1/2 + 1/(2*sin(2))))
> -0.25324  OK  x**3*(1 + 1/sin(1))/(6*(1/2 + 1/(2*sin(1))))
> -0.25458  OK  x**3*(1/tan(1) + 1)/(15*(1/(5*tan(1)) + 1/5))
> -0.25508  OK  x**3*(1 + log(1024))/(15*(1/5 + log(1024)/5))
> -0.25525  OK  x**3*(1/tan(2) + 1)/(15*(1/(5*tan(2)) + 1/5))
> -0.25647  OK  x**3*(1 + 1/cos(1))/(15*(1/5 + 1/(5*cos(1))))
> -0.25774  OK  x**3*(1 + 1/sin(1))/(15*(1/5 + 1/(5*sin(1))))
> -0.28240  OK  x**3*(log(2) + 1)/(15*(log(2)/5 + 1/5))
>
> 3. DL doesn't understand multiplication very well.
> For example, to integrate "19*sin(x/17)", its answers are
>
> -0.12595  NO  -365*cos(x/17)
> -0.12882  NO  -373*cos(x/17)
> -0.14267  NO  -361*cos(x/17)
> -0.14314  NO  -357*cos(x/17)
> -0.18328  NO  -353*cos(x/17)
> -0.20499  NO  -377*cos(x/17)
> -0.21484  NO  -352*cos(x/17)
> -0.25740  NO  -369*cos(x/17)
> -0.26029  NO  -359*cos(x/17)
> -0.26188  NO  -333*cos(x/17)
>
> 4. DL doesn't handle long expressions very well.
>
> For example, to integrate
> 'sin(x)+cos(x)+exp(x)+log(x)+tan(x)+atan(x)+acos(x)+asin(x)'
> its answers are
>
> -0.00262  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.07420  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(cos(x)) + sin(x) - cos(x)
> -0.10192  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + 2*sin(x) - cos(x)
> -0.10513  NO  x*log(x) + x*acos(x) + x*asin(x) + exp(x) - log(x**2 + 1)/2 - log(cos(x)) + sin(x) - cos(x)
> -0.10885  NO  x*log(x) + x*sin(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.10947  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.13657  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(exp(x) + 1) + sin(x) - cos(x)
> -0.16144  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) + log(x + 1) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.16806  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(cos(x)) + sin(x) - cos(x)
> -0.19019  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(exp(asinh(x)) + 1) + sin(x) - cos(x)
>
> 5.
> For the FWD test set with 9986 integrals (which is generated by
> taking a random expression first, then trying to solve it with sympy
> and discarding failures), FriCAS can solve 9980 out of 9986 in 71
> seconds.  Of the remaining 6 integrals, FriCAS can solve another 2
> under 100 seconds and gives "implementation incomplete" for 2
> integrals; the remaining 2 integrals contain complex constants like
> "acos(acos(tan(3)))", which FriCAS can solve using another function.
>
> The DL system can solve 95.6%; by comparison FriCAS is over 99.94%.
>
> 6. The DL system is slow.  To solve the FWD test set, the DL system
> may use around 100 hours of CPU time.

You mean 10000 examples?  That would be an average of 36 seconds per
example...  IIUC you run on CPU; they probably got much shorter runtime
on GPU.

> 7. For the BWD test set (which is generated by taking a random
> expression first, then taking its derivative as the integrand),
> FriCAS can solve roughly 95%, compared with DL's claimed 99.5%.
> The paper says Mathematica can solve 84.0%; I'm a little skeptical
> about that.

I posted here a generator that attempted to match parameters to the DL
paper and got a 78% success rate.  That discovered a few bugs and the
percentage should be higher now, but much lower than 95%.  So
apparently they used easier examples (several details in the paper were
rather unclear and I had to use my guesses).  I wonder how well DL
would do on examples from my generator?  In particular, the paper does
not mention simplification of examples.  Unsimplified derivatives tend
to contain visible traces of the primitive; after simplification the
problem gets harder.

> 8. DL doesn't handle rational function integration very well.
>
> It can handle '(x+1)^2/((x+1)^6+1)' but not its expanded form.
>
> So DL can recognize patterns, but it really doesn't have insight.
>
> Rational function integration can be handled well by the
> Lazard-Rioboo-Trager algorithm, while DL fails at many
> rational function integrals.
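[Editor's note: point 8 is easy to check mechanically.  A minimal
sketch, assuming sympy (whose rational-function integrator uses the
Lazard-Rioboo-Trager algorithm for the logarithmic part); this is not
a script from the original thread:]

```python
# Editor's sketch (not from the original mail): the expanded form of
# (x+1)**2/((x+1)**6 + 1) is a plain rational function, so a classical
# rational-function integrator such as sympy's handles it even though
# the DL model reportedly does not.
import sympy as sp

x = sp.symbols('x')

# Expand numerator and denominator so no (x+1) pattern is visible.
expanded = sp.expand((x + 1)**2) / sp.expand((x + 1)**6 + 1)

F = sp.integrate(expanded, x)

# Correctness check: the derivative of the result must equal the
# integrand again (up to simplification).
assert sp.simplify(sp.diff(F, x) - expanded) == 0
print(F)
```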
> So some of my comments a year ago are correct:
>
> "
> In fact, I doubt that this program can solve some rational function
> integrations that require the Lazard-Rioboo-Trager algorithm to get
> a simplified result.
> "
>
> 9. DL doesn't handle algebraic function integration very well.
>
> I have a list of algebraic functions that FriCAS can solve while
> other CASs can't; DL can't solve them either.
>
> 10. For the harder mixed-case integration, I have a list of
> integrals that FriCAS can't handle; DL can't solve them either.
>
> - Best,
> - Qian
>
> On 11/16/20 8:34 PM, Qian Yun wrote:
> > Hi guys,
> >
> > I assume you all know the paper "DEEP LEARNING FOR SYMBOLIC
> > MATHEMATICS" by facebook AI researchers, almost one year ago,
> > posted on https://arxiv.org/abs/1912.01412
> >
> > And the code was posted 8 months ago:
> > https://github.com/facebookresearch/SymbolicMathematics
> >
> > Have you played with it?
> >
> > I finally had some time recently and played with it for a while,
> > and I believe I found some flaws.  I will post my findings with
> > more details later.  And it's a really interesting experience.
> >
> > If you have some spare time and want to have fun, I strongly
> > advise you to play with it and try to break it :-)
> >
> > Tip: to run the jupyter notebook example, apply the following
> > patch to run it on CPU instead of CUDA:
> >
> > - Best,
> > - Qian
> >
> > =====================
> >
> > diff --git a/beam_integration.ipynb b/beam_integration.ipynb
> > index f9ef329..00754e3 100644
> > --- a/beam_integration.ipynb
> > +++ b/beam_integration.ipynb
> > @@ -64,6 +64,6 @@
> >      "\n",
> >      "    # model parameters\n",
> > -    "    'cpu': False,\n",
> > +    "    'cpu': True,\n",
> >      "    'emb_dim': 1024,\n",
> >      "    'n_enc_layers': 6,\n",
> >      "    'n_dec_layers': 6,\n",
> > diff --git a/src/model/__init__.py b/src/model/__init__.py
> > index 2b0a044..73ec446 100644
> > --- a/src/model/__init__.py
> > +++ b/src/model/__init__.py
> > @@ -38,7 +38,7 @@
> >      # reload pretrained modules
> >      if params.reload_model != '':
> >          logger.info(f"Reloading modules from {params.reload_model} ...")
> > -        reloaded = torch.load(params.reload_model)
> > +        reloaded = torch.load(params.reload_model, map_location=torch.device('cpu'))
> >          for k, v in modules.items():
> >              assert k in reloaded
> >              if all([k2.startswith('module.') for k2 in reloaded[k].keys()]):
> > diff --git a/src/utils.py b/src/utils.py
> > index bd90608..ef87582 100644
> > --- a/src/utils.py
> > +++ b/src/utils.py
> > @@ -25,7 +25,7 @@
> >  FALSY_STRINGS = {'off', 'false', '0'}
> >  TRUTHY_STRINGS = {'on', 'true', '1'}
> >
> > -CUDA = True
> > +CUDA = False
> >
> >
> >  class AttrDict(dict):
>
> -- 
> You received this message because you are subscribed to the Google
> Groups "FriCAS - computer algebra system" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/fricas-devel/a6e0c843-fdcc-6b1c-3159-21499d7c05b9%40gmail.com.

-- 
                              Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/fricas-devel/20201125150419.GA29555%40math.uni.wroc.pl.
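[Editor's note: the OK/NO labels in the answer lists above can be
reproduced mechanically by differentiating each candidate answer and
comparing it with the integrand.  A minimal sketch, assuming sympy;
the exact script Qian used is not shown in this thread, and the
`check` helper below is a hypothetical name:]

```python
# Editor's sketch (not from the original mail): a candidate
# antiderivative is accepted iff its derivative equals the integrand
# after simplification.
import sympy as sp

x = sp.symbols('x')

def check(integrand, candidate):
    """Return 'OK' if d/dx(candidate) == integrand, else 'NO'."""
    return 'OK' if sp.simplify(sp.diff(candidate, x) - integrand) == 0 else 'NO'

# Examples from point 1 (integrating x**1678):
print(check(x**1678, x**1679/1679))   # the true antiderivative -> OK
print(check(x**1678, x**1685/1685))   # a beam answer from above -> NO
```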
