Sorry, I just read patch improperly. It's fine. Thanks for perseverance. Dmitry.
On Tue, Dec 10, 2013 at 4:06 PM, Ard Biesheuvel <ard.biesheu...@linaro.org>wrote: > On 10 December 2013 12:57, Dmitry Stogov <dmi...@zend.com> wrote: > > You missed the second introduced addition: > > > > > > + Z_LVAL_P(result) = Z_LVAL_P(op1) + > > Z_LVAL_P(op2); > > > > So for each $a + $b PHP is going to execute two "add" instructions > instead > > of one (plus memory reads and writes). > > > > No it is not. Please look carefully at the assembly listing that I > shared below. This is GCC generated code, I am not making this up. > > The add is performed only once. The only difference is that the store > register operation which writes the result of the addition back to the > zval is performed conditionally, i.e., only if no overflow occurred. > In fact, the GCC generated code is now closer to the inline assembly > you wrote for x86. > > So before my change: > > >> @ LONG(result) = LONG(op1) + LONG(op2) > >> 9160: f9401ba1 ldr x1, [x29,#48] > >> 9164: f9400260 ldr x0, [x19] > >> 9168: 8b000020 add x0, x1, x0 > > Z_LVAL_P(result) = Z_LVAL_P(op1) + Z_LVAL_P(op2) follows here > > >> 916c: f9000260 str x0, [x19] > >> > >> @ conditional branch on overflow > >> 9170: ca010002 eor x2, x0, x1 > >> 9174: b6f80262 tbz x2, #63, 91c0 > >> <zif_array_sum+0x118> > >> 9178: 39005274 strb w20, [x19,#20] > >> > > > After the change: > > >> 9118: f9400260 ldr x0, [x19] > >> 911c: f9401ba1 ldr x1, [x29,#48] > >> 9120: ca000023 eor x3, x1, x0 > >> 9124: 8b010002 add x2, x0, x1 > >> > >> @ conditional branch on overflow > >> 9128: b6f80583 tbz x3, #63, 91d8 > >> <zif_array_sum+0x130> > > the Z_LVAL_P(result) assignment is moved here, after the conditional branch > > >> 912c: f9000262 str x2, [x19] > >> 9130: 39005274 strb w20, [x19,#20] > >> > > so we have the exact same instruction count, the exact same number of > loads and stores, the only difference is the order. > > -- > Ard. > > > >> > >> I have trouble understanding why this is unacceptable. The only > >> significant difference is that the str and tbz instructions are > >> switched. > >> Can you elaborate please? > >> > >> -- > >> Ard. > >> > >> > >> > It's better to change array_sum() to break aliasing instead of overall > >> > slowdown. > >> > Could you take care about a better fix? > >> > > >> > Thanks. Dmitry. > >> > > >> > > >> > On Tue, Dec 10, 2013 at 3:28 PM, Ard Biesheuvel > >> > <ard.biesheu...@linaro.org> > >> > wrote: > >> >> > >> >> On 10 December 2013 12:25, Dmitry Stogov <dmi...@zend.com> wrote: > >> >> > What exactly are you fixing with this patch? > >> >> > fast_add_function() is used only by VM and opcode operands can't > >> >> > alias > >> >> > (it's > >> >> > guaranteed by compiler). > >> >> > It's also used by array_sum(), but it also can't create aliases > >> >> > between > >> >> > results and operands. > >> >> > > >> >> > What is the reason to slowdown each integer addition? > >> >> > > >> >> > >> >> The patch that was applied to fix > >> >> https://bugs.php.net/bug.php?id=65304 uses calls fast_add_function() > >> >> like this: > >> >> > >> >> fast_add_function(return_value, return_value, &entry_n TSRMLS_CC); > >> >> > >> >> so it does create aliases. If that is undesirable, perhaps we should > >> >> fix array_sum() instead? > >> >> > >> >> However, I think the compiler will be smart enough to factor out the > >> >> op1+op2 operation, so i don't expect any significant slowdown. > >> >> > >> >> -- > >> >> Ard. > >> >> > >> >> > >> >> > >> >> > Thanks. Dmitry. > >> >> > > >> >> > > >> >> > > >> >> > On Tue, Dec 10, 2013 at 3:12 PM, Ard Biesheuvel > >> >> > <ardbiesheu...@php.net> > >> >> > wrote: > >> >> >> > >> >> >> Commit: 60d2e70c062e436a6c6cd3c8a17469a083a38b46 > >> >> >> Author: Ard Biesheuvel <ard.biesheu...@linaro.org> > Tue, > >> >> >> 10 > >> >> >> Dec > >> >> >> 2013 12:07:46 +0100 > >> >> >> Parents: 5a87b7ff39bbf427807c46d1e51e2654259ad394 > >> >> >> Branches: PHP-5.6 master > >> >> >> > >> >> >> Link: > >> >> >> > >> >> >> > >> >> >> > http://git.php.net/?p=php-src.git;a=commitdiff;h=60d2e70c062e436a6c6cd3c8a17469a083a38b46 > >> >> >> > >> >> >> Log: > >> >> >> Zend: fix overflow handling bug in non-x86 fast_add_function() > >> >> >> > >> >> >> The 'result' argument of fast_add_function() may alias with either > >> >> >> of its operands (or both). Take care not to write to 'result' > before > >> >> >> reading op1 and op2. > >> >> >> > >> >> >> Changed paths: > >> >> >> M Zend/zend_operators.h > >> >> >> > >> >> >> > >> >> >> Diff: > >> >> >> diff --git a/Zend/zend_operators.h b/Zend/zend_operators.h > >> >> >> index 0152e03..5c6fc86 100644 > >> >> >> --- a/Zend/zend_operators.h > >> >> >> +++ b/Zend/zend_operators.h > >> >> >> @@ -643,13 +643,18 @@ static zend_always_inline int > >> >> >> fast_add_function(zval > >> >> >> *result, zval *op1, zval *o > >> >> >> "n"(ZVAL_OFFSETOF_TYPE) > >> >> >> : "rax","cc"); > >> >> >> #else > >> >> >> - Z_LVAL_P(result) = Z_LVAL_P(op1) + > >> >> >> Z_LVAL_P(op2); > >> >> >> + /* > >> >> >> + * 'result' may alias with op1 or op2, so > we > >> >> >> need > >> >> >> to > >> >> >> + * ensure that 'result' is not updated > until > >> >> >> after > >> >> >> we > >> >> >> + * have read the values of op1 and op2. > >> >> >> + */ > >> >> >> > >> >> >> if (UNEXPECTED((Z_LVAL_P(op1) & > >> >> >> LONG_SIGN_MASK) > >> >> >> == > >> >> >> (Z_LVAL_P(op2) & LONG_SIGN_MASK) > >> >> >> - && (Z_LVAL_P(op1) & > LONG_SIGN_MASK) > >> >> >> != > >> >> >> (Z_LVAL_P(result) & LONG_SIGN_MASK))) { > >> >> >> + && (Z_LVAL_P(op1) & > LONG_SIGN_MASK) > >> >> >> != > >> >> >> ((Z_LVAL_P(op1) + Z_LVAL_P(op2)) & LONG_SIGN_MASK))) { > >> >> >> Z_DVAL_P(result) = (double) > >> >> >> Z_LVAL_P(op1) > >> >> >> + (double) Z_LVAL_P(op2); > >> >> >> Z_TYPE_P(result) = IS_DOUBLE; > >> >> >> } else { > >> >> >> + Z_LVAL_P(result) = Z_LVAL_P(op1) + > >> >> >> Z_LVAL_P(op2); > >> >> >> Z_TYPE_P(result) = IS_LONG; > >> >> >> } > >> >> >> #endif > >> >> >> > >> >> >> > >> >> >> -- > >> >> >> PHP CVS Mailing List (http://www.php.net/) > >> >> >> To unsubscribe, visit: http://www.php.net/unsub.php > >> >> >> > >> >> > > >> > > >> > > > > > >