Hi,

I have a local version that tests for the 'errorvalue' that gets returned while stack unwinding is going on. I have not checked it in yet because it did not give a speedup, but I will keep it around and look into it a little further. We are probably making life difficult for the compiler by using the resume_label. What we IMO really should do is tell the compiler not to bother with the label and let all the register setup be handled in the resume block. In other words, we need to make it clearer (for the compiler) that the non stack unwind/resume path is the case to optimize. Now the question is, of course, how to do that?
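
One possible direction, as a rough sketch only: test the callee's return value against a sentinel, mark that test as unlikely, and push the resume work into a separate out-of-line function so the fast path carries no label. This assumes GCC-style `__builtin_expect` and `__attribute__((noinline, cold))`. The names `UNWIND_ERRORVALUE`, `frame_stack_bottom`, `callee`, `resume_after_unwind` and `caller` are hypothetical stand-ins, not the actual ll_stackless.h code, and the "resume" here simply redoes the call.

```c
#include <assert.h>
#include <stddef.h>

/* Branch and placement hints; they degrade to no-ops on other compilers. */
#if defined(__GNUC__)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#  define COLD        __attribute__((noinline, cold))
#else
#  define UNLIKELY(x) (x)
#  define COLD
#endif

/* Hypothetical sentinel returned while the stack is unwinding.
   A real sentinel must lie outside the callee's valid result range. */
#define UNWIND_ERRORVALUE (-1)

/* Hypothetical stand-in for slp_frame_stack_bottom: non-NULL while unwinding. */
static int   dummy_frame;
static void *frame_stack_bottom = NULL;

/* Callee: returns an even result, or the sentinel when it starts an unwind. */
static int callee(int x, int start_unwind)
{
    if (start_unwind) {
        frame_stack_bottom = &dummy_frame;
        return UNWIND_ERRORVALUE;
    }
    return x * 2;
}

/* Cold path: all unwind/resume bookkeeping lives here, out of line,
   so the caller's fast path needs no resume label. */
static COLD int resume_after_unwind(int x)
{
    assert(frame_stack_bottom != NULL);
    frame_stack_bottom = NULL;   /* pretend the frame was saved and resumed */
    return callee(x, 0);         /* redo the call after the "resume" */
}

/* Caller: one compare on the return value, hinted as rarely taken. */
int caller(int x, int start_unwind)
{
    int r = callee(x, start_unwind);
    if (UNLIKELY(r == UNWIND_ERRORVALUE))
        r = resume_after_unwind(x);
    return r + 1;
}
```

With this sketch, `caller(5, 0)` and `caller(5, 1)` return the same value; only the second goes through the cold function. Whether this actually yields shorter or faster code would still need measuring.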

cheers
Eric


On Feb 22, 2006, at 9:22 PM, [EMAIL PROTECTED] wrote:

Author: tismer
Date: Wed Feb 22 21:22:46 2006
New Revision: 23599

Modified:
   pypy/dist/pypy/translator/c/src/ll_stackless.h
Log:
a tiny change to stackless switching. This is still optimizable by using special return values (please talk with me when considering). Anyway, the effects of these changes are still rather unpredictable.

Modified: pypy/dist/pypy/translator/c/src/ll_stackless.h
==============================================================================
--- pypy/dist/pypy/translator/c/src/ll_stackless.h      (original)
+++ pypy/dist/pypy/translator/c/src/ll_stackless.h Wed Feb 22 21:22:46 2006
@@ -26,14 +26,33 @@

 #define RPyExceptionClear()       rpython_exc_type = NULL

+/*
 #define StacklessUnwindAndRPyExceptionHandling(unwind_label, resume_label, exception_label) \
             if (RPyExceptionOccurred()) {   \
                 if (slp_frame_stack_bottom) \
                     goto unwind_label;      \
-            resume_label:                   \
+          resume_label:                   \
                 if (RPyExceptionOccurred()) \
                     FAIL(exception_label);  \
             }
+
+ Following code was supposed to compile to shorter machine code, but on windows it doesn't.
+ Probably some other code folding is prevented, and there is a tiny increase of 20 kb.
+ I'm leaving the change in here, anyway. Richards is getting a bit slower, PyStone
+    is getting faster, all in all speed is slightly increased.
+ We should further investigate and try to use Eric's suggestion of checking certain
+    return values to get even shorter code paths.
+ In any case, these optimizations are still flaky, because we are probably in a high
+ noise level of caching effects and random decisions of the compiler.
+*/
+#define StacklessUnwindAndRPyExceptionHandling(unwind_label, resume_label, exception_label) \
+          resume_label:                   \
+            if (RPyExceptionOccurred()) {   \
+                if (slp_frame_stack_bottom) \
+                    goto unwind_label;      \
+                FAIL(exception_label);      \
+            }
+*/
 #else

 #define RPyRaisePseudoException()
_______________________________________________
pypy-svn mailing list
[EMAIL PROTECTED]
http://codespeak.net/mailman/listinfo/pypy-svn

_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev
