You're right, my code seems to be fast, because it didn't calculate
(almost) anything, yet. When you calculate s2[10][10][10][10], it
calculates only what it needs for that, meaning the coefficients with
i,j,k,l<=10. That's not nearly half of them, it's about 1/256, so my code
is in reality *much* less performant than yours.
Speaking of all coefficients: since it's a polynomial of 4 variables, and
maximum degree 40 in every one of them, the total would be 41^4=2825761.
True, those with i+j+k+l>40 are 0, but so are all with i+j+k+l not in {0,
20, 40}.
I don't know why your code gave the result 4.7053608710735743e21. Since the
exact result is 40!/10!^4=4705360871073570227520, obviously,
4.70536087107357e21 is a bit more precise.
Well, I've explained already why I think multivariate Taylor series of
higher degree are not a good idea, and the performance of my code seems to
be especially bad in this case. I've mentioned that already in the
bivariate case,
evaluate(c,70,0.2,0.3)
takes minutes, and I was speaking of almost 10 minutes! So I'll use that
only in the univariate case, or limit it to really small degrees.