Hi Dmitry, all,

Things are looking really good now, f-i-n-a-l-l-y! Last weekend, after thinking "no more changes" again, I thought of a couple more improvements, hah, and now I think there isn't much that could be improved.

So it should be fully alive this weekend, flying, with the fastest/smallest parameter parsing we could imagine, across all of PHP! I guess that means you can start looking for it next week...? :-) I may just send a patch sooner, without first writing up an explanation of the parts like I'd planned.

More below...

----- Original Message -----
From: "Matt Wilmas"
Sent: Wednesday, August 05, 2015

Hi Dmitry,

[...]

Unfortunately (only because I said "same macro syntax," but no big deal),
the syntax had to be changed, from:

ZEND_PARSE_PARAMETERS_START[_EX](...)
   Z_PARAM_*(...)
   Z_PARAM_*(...)
ZEND_PARSE_PARAMETERS_END[_EX]

to

ZEND_PARSE_PARAMETERS_START[_EX](...)(   // Parentheses
   Z_PARAM_*(...),   // Comma-separated
   Z_PARAM_*(...)
) ZEND_PARSE_PARAMETERS_END[_EX]
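
For readers wondering what a comma-separated form makes possible in plain C, here is a minimal sketch. It is not the patch: the HYP_* names and the one-byte type codes are hypothetical, and the only thing it shows is that a comma-separated macro list can be dropped straight into a single C99 compound literal, which a statement-per-line macro sequence cannot.

```c
#include <assert.h>

/* Hypothetical one-byte type codes; these names are not PHP macros. */
#define HYP_PARAM_STRING 's'
#define HYP_PARAM_LONG   'l'
#define HYP_PARAM_BOOL   'b'

/* The comma-separated items become the initializer of one compound
 * literal (C99), so the whole spec is a single constant array. */
#define HYP_SPEC(...)     ((const char[]){ __VA_ARGS__, '\0' })

/* Number of items, computed by the compiler from the same list. */
#define HYP_SPEC_LEN(...) (sizeof((const char[]){ __VA_ARGS__ }))

/* Example call site: returns the i-th type code of a fixed spec. */
static char hyp_example_code(int i)
{
    const char *spec =
        HYP_SPEC(HYP_PARAM_STRING, HYP_PARAM_LONG, HYP_PARAM_BOOL);

    assert(i >= 0 &&
           i < (int)HYP_SPEC_LEN(HYP_PARAM_STRING, HYP_PARAM_LONG,
                                 HYP_PARAM_BOOL));
    return spec[i];
}
```

Whether a compiler places such a literal in read-only data or folds it away entirely depends on the compiler and optimization level.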


Errors in nested macros might be very difficult to understand :(
I would prefer not to use nested macros without a significant gain.

[...]

Anyway though, it doesn't matter much; not sure what you'll want to do with all the possibilities I have! And a simple script converts occurrences to the new syntax for testing (instead of a bigger patch).

Significant gain? Nope. :-) I only did that in order to use the "static" storage specifier in one place, for a pointer to the packed rodata, instead of filling it at runtime. But I think the file size was the same with or without static, even though it saved instructions. So not a requirement, just part of my experiments.

I was wrong about this; there is "significant gain." A few days after my last message, I couldn't even figure out my own code, haha, trying to remember what I tried, when. Anyway, the macro change was NOT to try the "static" specifier (that came later), but is the basis for the compiler "magic" that allows the &dest vars to never be referenced -- e.g. no lea, etc. in the function. So the minor macro change is important.

Like I said, the BIG neat thing is getting the same optimization (all except the "static" part) for the *traditional* ZPP. I hadn't touched it since last message until this week (doing other stuff and too sick ~4 days to do anything :-/) and wanted to check closer to final code before replying -- but still looks good with GCC so far!

Clang 3.4 is also generating perfect code for compile-time parsing of traditional ZPP's format string. I'll monitor it and GCC closely as final changes are made. I haven't tried older versions yet to see if there's a minimum version to get compile-time transformation.
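
As a rough illustration of what compile-time handling of a format string can look like (a sketch under assumptions, not the actual code): the helper below counts the required parameters in a zpp-style spec. The names hyp_required_args and HYP_CHECK_MIN_ARGS are hypothetical, and only the '|', '/' and '!' markers are handled. Because the spec at a call site is a string literal and the helper is a small inline function, an optimizing compiler may reduce the loop to a constant, though nothing in the language guarantees that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PHP: counts the type letters that
 * appear before the '|' (optional) marker. The '/' and '!' modifiers
 * are skipped because they do not introduce a parameter. */
static inline int hyp_required_args(const char *spec)
{
    int n = 0;

    assert(spec != NULL);
    for (; *spec != '\0' && *spec != '|'; spec++) {
        if (*spec != '/' && *spec != '!') {
            n++;
        }
    }
    return n;
}

/* Hypothetical wrapper: with a literal spec, the call above may fold
 * to a constant and leave only this comparison at the call site. */
#define HYP_CHECK_MIN_ARGS(num_args, spec) \
    ((num_args) >= hyp_required_args(spec))
```

For example, HYP_CHECK_MIN_ARGS(num_args, "sl|b") would ideally compile down to a single comparison of num_args against 2; checking the generated assembly is the only way to know whether a given compiler version does that.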

BTW, I wondered about zpp's "num_args" param -- assumed it was *always* equal to ZEND_NUM_ARGS(), until I saw some instances with a fixed literal number. Oops! Luckily, the "check" can be optimized out in all but one case AFAICT with GCC. And all but a handful of other cases (in pgsql.c) with Clang.

So depending, there's maybe less interest in my smaller FAST_ZPP implementation... *shrug*

Never mind that comment...

Weeks ago, I thought it might be desired to not give up inlining in all cases to get small code. So I thought about a "hybrid" system where the *smallest* code could be inlined for the *simplest* cases (when function call would have highest % overhead), otherwise call the function. That's in the process of being finished now, with some settings (#define's) to control amount of inlining (or none). I'm hoping the compilers will again remove everything but the few necessary instructions without me having to make explicit checks...
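
A toy sketch of that hybrid idea, assuming a made-up setting and made-up function names (HYP_INLINE_MAX_PARAMS, hyp_parse, hyp_parse_call are all hypothetical, and the "parsing" is reduced to copying longs): the single-parameter case is handled inline, everything else goes through one shared function. When num_params is a constant at the call site, the compiler can drop whichever branch is not taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical knob: largest parameter count handled inline.
 * Defining it as 0 sends every call site to the shared function. */
#ifndef HYP_INLINE_MAX_PARAMS
#define HYP_INLINE_MAX_PARAMS 1
#endif

/* Shared path: generic loop used by all larger call sites.
 * Returns 0 on success, -1 on an argument-count mismatch. */
static int hyp_parse_call(const long *args, int num_args,
                          long **dests, int num_params)
{
    if (num_args != num_params) {
        return -1;
    }
    for (int i = 0; i < num_params; i++) {
        *dests[i] = args[i];
    }
    return 0;
}

/* Inline wrapper: only the simplest case is expanded at the call site. */
static inline int hyp_parse(const long *args, int num_args,
                            long **dests, int num_params)
{
    assert(args != NULL && dests != NULL);
    if (HYP_INLINE_MAX_PARAMS >= 1 && num_params == 1) {
        if (num_args != 1) {
            return -1;
        }
        *dests[0] = args[0];
        return 0;
    }
    return hyp_parse_call(args, num_args, dests, num_params);
}
```

Note that a compiler is free to inline hyp_parse_call as well; keeping it out of line reliably would need a compiler-specific attribute, which this sketch leaves out.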

[...]
sub    $0x20,%rsp # 16 bytes more; each parameter needs 16 bytes stack

I realized that the stack space could easily be reduced to 8 bytes per &dest var, instead of 16/param... Doesn't really matter I guess, but now on 64-bit, stack space is same as normal &dest vars, except zend_bools. (Compiler effectively *removes* all dest vars. :^))

[...]
That (optimizing traditional string ZPP) will be the *equivalent* of 64KB+ of C code (repetition), all reduced to nothing. :-) And more of that should (will) be packed together. Hopefully this continues, and with other compilers, on non-Windows anyway.

Don't know about Windows now... Visual Studio 2008 and 2012 (not much difference) are NOT optimizing away the code (other times it was GCC with issues). :-/ Not sure why. Of course they don't support the necessary compound literals anyway, but I was just testing a manual case... I'll have to try and check the 2015 version soon.

Nope, VS 2015 still won't optimize away any of it, it seems. So looks like no compile-time transformation of traditional zpp on Windows...

Regardless, there will be a fallback function to be called with optimized runtime string parsing, to be used if compilers don't create optimized code. I'll be checking more compilers, of course...

For the sake of Windows (or any other fallbacks), I really wanted to optimize zpp well for runtime string parsing. After overthinking it, it's fairly simple, and what could've been there all along. What we should wind up with is traditional zpp that's as good as the "fast parsing" function, except: 1) lea, etc. for vars at call site, and 2) the string has to be looked at ONCE, instead of 6-7 times now. :-O
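
To make the "looked at ONCE" point concrete, here is a minimal single-pass sketch. It is not the fallback function itself: hyp_spec_info and hyp_scan_once are hypothetical names, and only the '|', '/' and '!' markers are recognized. One walk over the spec yields both the minimum and maximum argument counts, so no separate pre-scans are needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: argument-count bounds for one spec. */
typedef struct {
    int min_args;
    int max_args;
} hyp_spec_info;

/* Walks the spec exactly once. Returns 0 on success, or -1 if the
 * '|' marker appears more than once. */
static int hyp_scan_once(const char *spec, hyp_spec_info *out)
{
    int count = 0;   /* type letters seen so far */
    int min = -1;    /* count at the '|' marker, -1 if not seen yet */

    assert(spec != NULL && out != NULL);
    for (const char *p = spec; *p != '\0'; p++) {
        switch (*p) {
        case '|':
            if (min >= 0) {
                return -1;       /* second '|' is malformed */
            }
            min = count;
            break;
        case '/':
        case '!':
            break;               /* modifiers add no parameter */
        default:
            count++;             /* any other char is a type letter */
            break;
        }
    }
    out->min_args = (min >= 0) ? min : count;
    out->max_args = count;
    return 0;
}
```

A real parser would fetch and convert each argument inside the same loop; the sketch stops at the counts to keep the single-pass structure visible.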

- Matt

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
