Hi Matt,

On Wed, Jul 22, 2015 at 11:16 PM, Matt Wilmas <php_li...@realplain.com>
wrote:

> Hi again Dmitry, all,
>
> Hopefully the final update on this, before all is revealed... :-)
>
> ----- Original Message -----
> From: "Matt Wilmas"
> Sent: Tuesday, July 07, 2015
>
>> Hi again Dmitry, all,
>>
>> [...]
>>
>> Just an update...  I didn't abandon this; quite the opposite!  I thought
>> I'd just put the finishing touches on my implementation and have it to you
>> almost a week ago.  After my rough initial test version, I made some
>> obvious, simple changes to reduce instructions/code size (slightly).  And
>> then analyzing different stuff with GCC and MSVC to see if it could be
>> improved more (not really since fairly straightforward), etc...
>>
>> ~5 days ago when I was done messing and changed the macros to recompile
>> the existing FAST_ZPP parts, I didn't know what the size difference would
>> be vs no FAST_ZPP (traditional).  I had overestimated the savings ("maybe a
>> few more bytes" for instructions).  It was in the 30-45% range of your
>> inlined version.
>>
>> I made a change to save instructions, but, strangely, it didn't really
>> have the effect on size I thought it would. :-/
>>
>> BTW, the improvement on Linux with GCC 4.8 was about the same: ~70% of
>> inlined.  So roughly ~2/3 speed for ~1/3 space.  I also finally installed
>> Valgrind and used Callgrind for the first time.  Simple. :^)  About same
>> relative reduction in instructions.
>>
>> I really wanted the code size to be smaller if this could get widespread
>> use, and started wondering, "What if...?", "How?", "Why not?", "But..."
>>
>> Then I had a new idea, but wasn't sure what the compilers would do with
>> it. So I spent Sunday prototyping a couple key parts of it outside of PHP.
>> GCC can make a HUGE mess of it, but easily worked around.  So it looks
>> good, even better than the ideal I had imagined.  Now I just have to do it
>> for PHP...
>>
>> This way saves the lea instructions for each &dest variable (like the
>> inline version), and then some.  And just earlier I realized there's a way
>> to save the other instructions (while using the same macro syntax), which
>> would also apply to the previous implementation.
>>
>> So ideally, this means at the CALL site, we should be able to have the
>> zend_fast_parse_... function call: Just mov+mov+lea+call on 64-bit, and
>> that's it.  The rest of the stuff (a good amount) can be COMPLETELY
>> optimized away! :-O
>>
>> And in the parse_... function, compared to the *inline* FAST_ZPP, that
>> should get it down to about 3 dozen more instructions per parameter: while
>> + switch + checks in zend_parse_arg_* that would get optimized away when
>> inlined.
>>
>> Well, I'll send the implementation(s) for you to test as soon as I can!
>>
>
> I tried to rush and finish things up before the weekend *2 weeks ago*, but
> it took me too long to get the macros sorted out and working right. :-/
> Sorry for the delay, but more and better goodness should now be included.
> The extra time allowed me to "relax and take notes" (Notorious B.I.G.),
> however. :-D
>
> So yeah, that was all working 10 days ago.  Then I realized more function
> param data could be packed together which saved another mov instruction --
> so at the call site, it's just mov+lea+call on 64-bit (since execute_data
> is already in %rdi).  There's nothing else (ignoring checking return
> value/return on error, etc.), and each &dest variable is filled in even
> though their address isn't taken (thanks to compiler magic).  The only
> exceptions are FUNC (4 instructions I think) and OBJECT_OF_CLASS and
> VARIADIC (1 instruction) types.
>
> Unfortunately (only because I said "same macro syntax," but no big deal),
> the syntax had to be changed, from:
>
> ZEND_PARSE_PARAMETERS_START[_EX](...)
>    Z_PARAM_*(...)
>    Z_PARAM_*(...)
> ZEND_PARSE_PARAMETERS_END[_EX]
>
> to
>
> ZEND_PARSE_PARAMETERS_START[_EX](...)(   // Parentheses
>    Z_PARAM_*(...),   // Comma-separated
>    Z_PARAM_*(...)
> ) ZEND_PARSE_PARAMETERS_END[_EX]
>
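
To make the syntax change concrete, here is a minimal sketch of how a "parenthesized, comma-separated" form can work: the START macro ends in the name of a second, function-like macro, which then swallows the parenthesized list and turns it into a descriptor array for one shared, out-of-line parser. Everything here is hypothetical (param_spec, fake_arg, parse_params_array, PARSE_START, PARSE_SPECS, PARSE_END_EX, P_LONG, P_STR); it is not the actual patch, which has not been posted yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: one entry per expected parameter. */
typedef struct {
    char  type;   /* 'l' = long, 's' = string (illustrative codes only) */
    void *dest;   /* where the parsed value is stored */
} param_spec;

/* Hypothetical stand-in for a zval: a tagged value. */
typedef struct {
    char        type;
    long        lval;
    const char *sval;
} fake_arg;

/* One out-of-line parser shared by every call site. */
static int parse_params_array(const fake_arg *args, size_t nargs,
                              size_t min, size_t max,
                              const param_spec *specs, size_t nspecs)
{
    assert(max == nspecs);              /* one spec per possible parameter */
    if (nargs < min || nargs > max) return -1;
    for (size_t i = 0; i < nargs; i++) {
        if (args[i].type != specs[i].type) return -1;
        switch (specs[i].type) {
        case 'l': *(long *)specs[i].dest = args[i].lval; break;
        case 's': *(const char **)specs[i].dest = args[i].sval; break;
        default:  return -1;
        }
    }
    return 0;
}

/* Each parameter macro expands to one array element. */
#define P_LONG(var) { 'l', &(var) }
#define P_STR(var)  { 's', &(var) }

/* START ends with the name PARSE_SPECS, so the "( ... )" that follows
 * in the source becomes PARSE_SPECS's argument list. */
#define PARSE_START(args, nargs, min, max) \
    do { \
        const fake_arg *args_ = (args); \
        size_t nargs_ = (nargs), min_ = (min), max_ = (max); \
        PARSE_SPECS
#define PARSE_SPECS(...) \
        param_spec specs_[] = { __VA_ARGS__ }; \
        if (parse_params_array(args_, nargs_, min_, max_, specs_, \
                               sizeof specs_ / sizeof specs_[0]) != 0)
#define PARSE_END_EX(on_error) \
            { on_error; } \
    } while (0)

/* Example call site: one required long, one optional string. */
static int demo(const fake_arg *args, size_t nargs,
                long *n_out, const char **s_out)
{
    long n = 0;
    const char *s = "default";

    PARSE_START(args, nargs, 1, 2)(
        P_LONG(n),
        P_STR(s)
    ) PARSE_END_EX(return -1);

    *n_out = n;
    *s_out = s;
    return 0;
}
```

The call site only builds a small array and makes one call; the type checks live in the shared function. The nesting is also why a typo inside the list tends to surface as an error reported against the outer macro.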

Errors in nested macros might be very difficult to understand :(
I would prefer not to use nested macros without a significant gain.


>
> Overall, the *code* size is reduced (vs traditional ZPP), but the file
> size isn't (static stuff in rodata or whatever), which was a bit
> surprising, although most of these PHP functions don't have many
> parameters...
>

I can only guess where this static data comes from, because I haven't seen
the code yet :)

Thanks. Dmitry.


>
> The biggest size savings actually came from the simple initial
> optimization of zend_parse_params_none().  Down to almost nothing, much
> faster, and saved 4KB on my --disable-all builds.
>
>
> NEW GOODNESS -- What would of course be nice to have is a big optimization
> of the traditional zend_parse[_method]_parameters[_ex|_throw] to avoid
> changing them all.  And it seems some people, like Derick, prefer it.
>
> Of course the obvious way I first had in mind weeks ago was to simply
> parse its format string faster (once-ish) at runtime, and then feed it to
> this new FAST_parse function.  Should give at least 2x speedup I figured.
> But with this latest implementation, where the function should probably now
> be called parse_parameters_ARRAY instead of fast_parse, it would need a
> second pass after parsing the string.  Not a huge deal, but...
>
> What would be *really nice* is to have the compiler parse the format
> string, at compile time, and use the new system directly.  And... that
> should be possible!! 8-)
>
> Last week I figured GCC's "statement expressions" [1] could be used, which
> most compilers seem to support, except MSVC.  But just over the weekend I
> realized an inline function could be used with a compound literal (for the
> varargs), which is also supported in the latest MSVC versions.  Awesome!
>
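
A rough sketch of the "inline function plus compound literal" idea, for illustration only: a macro packs the variadic destination pointers into a C99 compound literal, and an inline function walks the format string. When the format is a string constant and the function is inlined, an optimizing compiler can often fold the loop away, though that depends on the compiler and flags. The names parse_fmt, PARSE and demo_fmt, the plain long argument array, and the tiny format language ('l', 'b', '|') are assumptions, not the real ZPP code.

```c
#include <stddef.h>

/* Hypothetical parser: 'l' = long, 'b' = bool, '|' = the rest is optional.
 * Arguments are plain longs here to keep the sketch small. */
static inline int parse_fmt(const long *args, size_t nargs,
                            const char *fmt,
                            void *const *dests, size_t ndests)
{
    size_t n = 0;        /* specifiers handled so far */
    int optional = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p == '|') { optional = 1; continue; }
        if (n >= ndests) return -1;       /* more specifiers than dests */
        if (n >= nargs) {
            if (!optional) return -1;     /* missing required argument */
            n++;                          /* optional: leave dest untouched */
            continue;
        }
        switch (*p) {
        case 'l': *(long *)dests[n] = args[n];       break;
        case 'b': *(int *)dests[n] = (args[n] != 0); break;
        default:  return -1;                          /* unknown specifier */
        }
        n++;
    }
    return nargs > n ? -1 : 0;            /* too many arguments passed */
}

/* The varargs become a compound literal array, so no va_list is needed.
 * sizeof does not evaluate its operand, so arguments are evaluated once. */
#define PARSE(args, nargs, fmt, ...) \
    parse_fmt((args), (nargs), (fmt), \
              (void *[]){ __VA_ARGS__ }, \
              sizeof((void *[]){ __VA_ARGS__ }) / sizeof(void *))

/* Example call site: required long, optional bool. */
static int demo_fmt(const long *args, size_t nargs, long *count, int *flag)
{
    return PARSE(args, nargs, "l|b", count, flag);
}
```

Unlike a GCC statement expression, this uses only standard C99 features, which matches the point about newer MSVC versions accepting it.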
> And again, fear not, ALL the code can be completely removed by the
> compiler, leaving only movb instructions instead of lea+mov/push for the
> traditional ZPP function call.  So, better than my initial
> implementation(s), and nearly the same as my final macro version!  I was
> just testing prototypes of portions with GCC yesterday, which does fine
> after adjusting to not generate *horribly stupid* code.
>
> Now to implement it into PHP ASAP!  Then I'll save a few more
> branches/instructions in the parse function (specialized for common cases;
> some useless GCC instructions), comment and clean up my experimental mess,
> and write up some explanation of the changes before sending patch.  Oh, and
> I should verify what Clang does with the code as well...
>
> Stay tuned!
>
> [1] https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
>
>
> - Matt
>
