RE: [PHP-DEV] Introduction and some opcache SSE related stuff
Hi Bogdan, -Original Message- From: Andone, Bogdan [mailto:bogdan.and...@intel.com] Sent: Wednesday, July 29, 2015 4:22 PM To: internals@lists.php.net Subject: [PHP-DEV] Introduction and some opcache SSE related stuff Hi Guys, My name is Bogdan Andone and I work for Intel in the area of SW performance analysis and optimizations. We would like to actively contribute to Zend PHP project and to involve ourselves in finding new performance improvement opportunities based on available and/or new hardware features. I am still in the source code digesting phase but I had a look to the fast_memcpy() implementation in opcache extension which uses SSE intrinsics: If I am not wrong fast_memcpy() function is not currently used, as I didn't find the -msse4.2 gcc flag in the Makefile. I assume you probably didn't see any performance benefit so you preserved generic memcpy() usage. I would like to propose a slightly different implementation which uses _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied memory is preserved in data cache, which is not bad as the interpreter will start to use this data without the need to go back one more time to memory. _mm_stream_si128() in the current implementation is intended to be used for stores where we want to avoid reading data into the cache and the cache pollution; in opcache scenario it seems that preserving the data in cache has a positive impact. Running php-cgi -T1 on WordPress4.1/index.php I see ~1% performance increase for the new version of fast_memcpy() compared with the generic memcpy(). Same result using a full load test with http_load on a Haswell EP 18 cores. Here is the proposed pull request: https://github.com/php/php-src/pull/1446 Related to the SW prefetching instructions in fast_memcpy()... they are not really useful in this place. 
Their benefit is almost negligible, as the address requested for prefetch will be needed at the next iteration (a few cycles later), while the time needed to get data from RAM is usually 100 cycles. Nevertheless, they don't hurt, and it seems they still have a very small benefit, so I preserved the original instruction and added a new prefetch request for the destination pointer.

AFAIR we always rely on the standard features, thus SSE2 in this particular case, for better compatibility. IMHO using newer things should be done more carefully. Having more stats would not hurt; from what I see, at least at http://store.steampowered.com/hwsurvey, it's still not safe to just switch away from SSE2. Maybe introducing some flexible solution, like compile-time switches for people who want to exhaust the features of modern hardware or specific features available from vendors, could be an approach. But it all is of course a matter of project definition.

Regards

Anatol

-- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 30 Jul 2015, at 13:14, Joe Watkins pthre...@pthreads.org wrote: I find myself agreeing with Pierre; The wrong signal would be sent. History should teach us there is no such thing as (a) safe mode. Hi Joe, Please can you read my proposal (see the email you just replied to, also below)... I'm replying on this thread because my first one was ignored... I'm not suggesting a safe mode or any kind of blocking of requests (as per the subject)... as I agree, and believe that would be worse than the old auto escaping from PHP 4. Craig On 30 Jul 2015, at 13:14, Joe Watkins pthre...@pthreads.org wrote: I find myself agreeing with Pierre; The wrong signal would be sent. History should teach us there is no such thing as (a) safe mode. Xinchen did used to work on a taint extension, I wonder why that was stopped ? Worth noticing that the extension is rather complex, touching many parts of the engine, changing many things ... which I don't really like. Cheers Joe On Thu, Jul 30, 2015 at 10:14 AM, Craig Francis cr...@craigfrancis.co.uk wrote: On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote: But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. I completely agree on education, and what I'm hoping for... and this is how we can educate everyone :-) My suggestion for taints (not quite the same as the one from Matt or Wietse) was not to change the way good programs are created/executed, but simply an education device, which can also pick up mistakes that experienced developers make. While my first post on this mailing list gives a better overview: http://news.php.net/php.internals/87207 The original implementation suggestion is at: https://bugs.php.net/bug.php?id=69886 You will see that it does nothing more than create notices to say erm, do you want to be doing this?. 
This is something that only PHP can do, unless you can find a way of changing every single article / code example on the internet :-) So, with your example... if you want to use a variable for a table/field prefix, that is perfectly fine... in fact, it won't need any changes, as the prefix will probably be hard coded as a string within a PHP script (something I called ETYPE_CONSTANT). But if not (e.g. storing the prefix in an ini file?), then I've shown an example of how that can be handled with the proposed string_encoding_set function (something I should have probably called string_escaping_set)... which is simply to tell PHP that this one variable is already safe (something I can't see being needed very often).

Craig

On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote:

On 29/07/15 16:11, Craig Francis wrote: I completely disagree... prepared statements are just as vulnerable, and so are ORMs. You can push developers towards these solutions, and that would be good, but you are completely blind if you think an uneducated developer won't do: if ($stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=" . $_GET['name'])) { }

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. Since the taint extension only covers mysql and sqlite it's of little use if we manage to convert 'uneducated developer' to any of the more secure databases, and that was one of the reasons why mysql was dropped from being loaded by default. Once one starts from a base of parametrised sql queries the lax programming methods many mysql guides and books continue to push can be reversed. Throwing more bloat into php to create 'WTF' errors just adds to a new user's frustration and annoys experienced users who have very good reasons for building queries using clean variables.
MANY abstraction layers use variables to add prefixes to table names or fields. Educate ... don't nanny ... -- Lester Caine - G8HFL - Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Re: [PHP-DEV] Introduction and some opcache SSE related stuff
Hey:

On Thu, Jul 30, 2015 at 8:24 PM, Joe Watkins pthre...@pthreads.org wrote: Hi Andone, I'm not sure why nobody has replied to you yet, we've all looked at the PR and spent a lot of the day yesterday discussing it. I've CC'd Dmitry, he doesn't always read internals, so this should get his attention.

Sorry for the late response; Dmitry is on vacation now, so he probably won't be able to reply soon. Anyway, is the performance improvement consistently seen? Have you tested it with a profiling tool? Is IR reduced, or are cache misses reduced? Thanks

Lastly, very cool ... I look forward to some more cleverness ... Cheers Joe

On Wed, Jul 29, 2015 at 3:22 PM, Andone, Bogdan bogdan.and...@intel.com wrote: Hi Guys, My name is Bogdan Andone and I work for Intel in the area of SW performance analysis and optimizations. We would like to actively contribute to Zend PHP project and to involve ourselves in finding new performance improvement opportunities based on available and/or new hardware features. I am still in the source code digesting phase but I had a look at the fast_memcpy() implementation in opcache extension which uses SSE intrinsics: If I am not wrong fast_memcpy() function is not currently used, as I didn't find the -msse4.2 gcc flag in the Makefile. I assume you probably didn't see any performance benefit so you preserved generic memcpy() usage. I would like to propose a slightly different implementation which uses _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied memory is preserved in data cache, which is not bad as the interpreter will start to use this data without the need to go back one more time to memory. _mm_stream_si128() in the current implementation is intended to be used for stores where we want to avoid reading data into the cache and the cache pollution; in opcache scenario it seems that preserving the data in cache has a positive impact.
Running php-cgi -T1 on WordPress4.1/index.php I see ~1% performance increase for the new version of fast_memcpy() compared with the generic memcpy(). Same result using a full load test with http_load on a Haswell EP 18 cores. Here is the proposed pull request: https://github.com/php/php-src/pull/1446 Related to the SW prefetching instructions in fast_memcpy()... they are not really useful in this place. Their benefit is almost negligible, as the address requested for prefetch will be needed at the next iteration (a few cycles later), while the time needed to get data from RAM is usually 100 cycles.

Then maybe we don't need this in fast_memcpy? I mean it may be used widely if it is proven to be faster, which will be out of this context. Thanks

Nevertheless, they don't hurt, and it seems they still have a very small benefit, so I preserved the original instruction and added a new prefetch request for the destination pointer. Hope it helps, Bogdan

-- Xinchen Hui @Laruence http://www.laruence.com/
Re: [PHP-DEV] Disabling External Entities in libxml By Default
On 7/29/15 6:01 PM, Stanislav Malyshev wrote: Hi!

Currently, PHP by default is vulnerable to XXE attacks: https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing To bypass this, you need to turn off external entity loading: libxml_disable_entity_loader(true);

AFAIR right now, due to how it is implemented, this blocks loading XML content from files with something like XMLReader::open() - due to the use of the same code path by both. It may have changed since last time I looked, but it definitely was a major reason why default stayed that way. What people did is something like that: libxml_disable_entity_loader( false ); $reader->open( $filename ); libxml_disable_entity_loader( true ); I imagine we could do better. But we need to be careful - if we just set it as disabled, we could break a lot of unsuspecting apps that do nothing more than reading XML files.

It hasn't changed since then. I've gone back and forth on this in the past. I've looked at both allowing the initial file to be read as well as potentially adding support for file resources. The latter requires app changes anyways so no big advantage there, tho would get the job done. I'm still going back and forth on it as a few things come to mind. If you are already working with a trusted document then you should safely be able to disable the entity loader. If you aren't then wouldn't you want to do some sort of checking (especially if you don't have an XML gateway fronting the system) for other malicious things before even opening the document regardless if it has external entities or not. The initial doc is really an external at that point. In our systems the entity loader is disabled right off the bat so if any developer needs to work with xml anywhere else in the system, they have to explicitly enable it, load and then disable it. Requires a conscious effort on their part as well as makes it easier to audit their usage of it and the trustworthiness of their data.
imo it might be better to leave the entity loader function as is and introduce a new function which can be enabled by default yet allow for the initial data to be read yet not allow any external loading from there. This way you are not going to have to relax the security aspect of the current function (where the current function always overrides the ability to read the initial data if the entity loader is disabled). One thing to be careful of tho is the breakage of unsuspecting apps. For anyone who relies on external DTD, schemas, xslt includes, etc.. even within their own environment, if they are not currently using the entity loader function now, they are certain to break. Whichever direction the consensus is for this I'm more than willing to help implement. Anyways just my 2 cents. Rob
Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30
2015-07-30 14:42 GMT+02:00 Andone, Bogdan bogdan.and...@intel.com: -Original Message- From: Niklas Keller [mailto:m...@kelunik.com] Sent: Thursday, July 30, 2015 1:47 PM To: Pierre Joye Cc: lp_benchmark_robot; PHP internals; l...@lists.01.org Subject: Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30 2015-07-30 11:57 GMT+02:00 Pierre Joye pierre@gmail.com: Hi, Does someone has a contact there? It would be nicer to have these results combined with what we pushed on qa.php.net as well. Cheers, Pierre Thought about that as well, results per mail aren't that useful, especially as they're badly formatted for me in GMail (no fixed font). A graph visualizing those numbers would be nice. Regards, Niklas Hi Guys, We are glad that our small ticking spam start to be observed :) ! We would like to offer valuable information to the community related to performance trends of the PHP project on Intel platforms based on daily builds and we are open for suggestions for making these results relevant. We chose to share our numbers as plain text mails for easily seeing the summary snapshots on discussion lists without the need of other clicks. Everybody agrees that plain text is ugly and, yes, you need to have fixed font in place for having the table formatted correctly. Let’s discuss a better way of doing; integration with qa.php.net is possible if we find the right interface for sharing data in an automated way. Normally l...@lists.01.org should be the official entry for feedbacks and requests but, unfortunately, it is not yet operational, so I will be your direct contact as I am part of the team which deploys this project. Kind regards, Bogdan Hi Bogdan, I think absolute numbers (instead of a %-change) would be better suited for visualizing performance over time. Regards, Niklas
RE: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30
-Original Message- From: Niklas Keller [mailto:m...@kelunik.com] Sent: Thursday, July 30, 2015 1:47 PM To: Pierre Joye Cc: lp_benchmark_robot; PHP internals; l...@lists.01.org Subject: Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30 2015-07-30 11:57 GMT+02:00 Pierre Joye pierre@gmail.com: Hi, Does someone has a contact there? It would be nicer to have these results combined with what we pushed on qa.php.net as well. Cheers, Pierre Thought about that as well, results per mail aren't that useful, especially as they're badly formatted for me in GMail (no fixed font). A graph visualizing those numbers would be nice. Regards, Niklas Hi Guys, We are glad that our small ticking spam start to be observed :) ! We would like to offer valuable information to the community related to performance trends of the PHP project on Intel platforms based on daily builds and we are open for suggestions for making these results relevant. We chose to share our numbers as plain text mails for easily seeing the summary snapshots on discussion lists without the need of other clicks. Everybody agrees that plain text is ugly and, yes, you need to have fixed font in place for having the table formatted correctly. Let’s discuss a better way of doing; integration with qa.php.net is possible if we find the right interface for sharing data in an automated way. Normally l...@lists.01.org should be the official entry for feedbacks and requests but, unfortunately, it is not yet operational, so I will be your direct contact as I am part of the team which deploys this project. Kind regards, Bogdan
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
I find myself agreeing with Pierre; The wrong signal would be sent. History should teach us there is no such thing as (a) safe mode. Xinchen did used to work on a taint extension, I wonder why that was stopped ? Worth noticing that the extension is rather complex, touching many parts of the engine, changing many things ... which I don't really like. Cheers Joe On Thu, Jul 30, 2015 at 10:14 AM, Craig Francis cr...@craigfrancis.co.uk wrote: On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote: But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. I completely agree on education, and what I'm hoping for... and this is how we can educate everyone :-) My suggestion for taints (not quite the same as the one from Matt or Wietse) was not to change the way good programs are created/executed, but simply an education device, which can also pick up mistakes that experienced developers make. While my first post on this mailing list gives a better overview: http://news.php.net/php.internals/87207 The original implementation suggestion is at: https://bugs.php.net/bug.php?id=69886 You will see that it does nothing more than create notices to say erm, do you want to be doing this?. This is something that only PHP can do, unless you can find a way of changing every single article / code example on the internet :-) So, with your example... if you want to use a variable for a table/field prefix, that is perfectly fine... in fact, it won't need any changes, as the prefix will probably be hard coded as a string within a PHP script (something I called ETYPE_CONSTANT). But if not (e.g. storing the prefix in an ini file?), then I've shown an example of how that can be handled with the proposed string_encoding_set function (something I should have probably called string_escaping_set)... 
which is simply to tell PHP that this one variable is already safe (something I can't see being needed very often).

Craig

On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote:

On 29/07/15 16:11, Craig Francis wrote: I completely disagree... prepared statements are just as vulnerable, and so are ORMs. You can push developers towards these solutions, and that would be good, but you are completely blind if you think an uneducated developer won't do: if ($stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=" . $_GET['name'])) { }

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. Since the taint extension only covers mysql and sqlite it's of little use if we manage to convert 'uneducated developer' to any of the more secure databases, and that was one of the reasons why mysql was dropped from being loaded by default. Once one starts from a base of parametrised sql queries the lax programming methods many mysql guides and books continue to push can be reversed. Throwing more bloat into php to create 'WTF' errors just adds to a new user's frustration and annoys experienced users who have very good reasons for building queries using clean variables. MANY abstraction layers use variables to add prefixes to table names or fields. Educate ... don't nanny ...

-- Lester Caine - G8HFL - Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
Hey: On Thu, Jul 30, 2015 at 8:14 PM, Joe Watkins pthre...@pthreads.org wrote: I find myself agreeing with Pierre; The wrong signal would be sent. History should teach us there is no such thing as (a) safe mode. Xinchen did used to work on a taint extension, I wonder why that was stopped ? yes, it is https://github.com/laruence/php-taint Anyway, I was too busy so I didn't make it supports PHP-5.6, I was hoping someone could help(it supports 5.5 now). it is a complex extension, and using tricky way to keep taint infos anyway, with PHP7's new zend_string, and string flags, the implementation will become easier. I have a plan to make it supports PHP7.. thanks Worth noticing that the extension is rather complex, touching many parts of the engine, changing many things ... which I don't really like. Cheers Joe On Thu, Jul 30, 2015 at 10:14 AM, Craig Francis cr...@craigfrancis.co.uk wrote: On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote: But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. I completely agree on education, and what I'm hoping for... and this is how we can educate everyone :-) My suggestion for taints (not quite the same as the one from Matt or Wietse) was not to change the way good programs are created/executed, but simply an education device, which can also pick up mistakes that experienced developers make. While my first post on this mailing list gives a better overview: http://news.php.net/php.internals/87207 The original implementation suggestion is at: https://bugs.php.net/bug.php?id=69886 You will see that it does nothing more than create notices to say erm, do you want to be doing this?. This is something that only PHP can do, unless you can find a way of changing every single article / code example on the internet :-) So, with your example... 
if you want to use a variable for a table/field prefix, that is perfectly fine... in fact, it won't need any changes, as the prefix will probably be hard coded as a string within a PHP script (something I called ETYPE_CONSTANT). But if not (e.g. storing the prefix in an ini file?), then I've shown an example of how that can be handled with the proposed string_encoding_set function (something I should have probably called string_escaping_set)... which is simply to tell PHP that this one variable is already safe (something I can't see being needed very often).

Craig

On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote:

On 29/07/15 16:11, Craig Francis wrote: I completely disagree... prepared statements are just as vulnerable, and so are ORMs. You can push developers towards these solutions, and that would be good, but you are completely blind if you think an uneducated developer won't do: if ($stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=" . $_GET['name'])) { }

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. Since the taint extension only covers mysql and sqlite it's of little use if we manage to convert 'uneducated developer' to any of the more secure databases, and that was one of the reasons why mysql was dropped from being loaded by default. Once one starts from a base of parametrised sql queries the lax programming methods many mysql guides and books continue to push can be reversed. Throwing more bloat into php to create 'WTF' errors just adds to a new user's frustration and annoys experienced users who have very good reasons for building queries using clean variables. MANY abstraction layers use variables to add prefixes to table names or fields. Educate ... don't nanny ...
-- Lester Caine - G8HFL - Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

-- Xinchen Hui @Laruence http://www.laruence.com/
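The taint idea discussed in this thread (a per-string flag recording whether a value came only from literals in the script, checked when a query is issued) might be sketched in C roughly as follows. This is not the php-taint extension, the RFC's proof of concept, or PHP's real zend_string; the type `sketch_str` and every `sketch_*` function are hypothetical names invented for illustration, and the check only reports a notice and returns a result, leaving it to the caller whether to block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an engine string that carries a flag. */
typedef struct {
    char text[256];
    bool from_literal; /* true only if every part came from script source */
} sketch_str;

/* Build a string, truncating to the fixed buffer (fine for a sketch). */
static sketch_str sketch_make(const char *s, bool from_literal)
{
    sketch_str out;
    size_t n;
    assert(s != NULL);
    n = strlen(s);
    if (n >= sizeof out.text) n = sizeof out.text - 1;
    memcpy(out.text, s, n);
    out.text[n] = '\0';
    out.from_literal = from_literal;
    return out;
}

/* A value written directly in the script. */
static sketch_str sketch_literal(const char *s) { return sketch_make(s, true); }

/* A value that arrived from outside, e.g. request data. */
static sketch_str sketch_input(const char *s) { return sketch_make(s, false); }

/* Concatenation stays "literal" only if both operands are literal. */
static sketch_str sketch_concat(const sketch_str *a, const sketch_str *b)
{
    sketch_str out;
    size_t la = strlen(a->text), lb = strlen(b->text);
    if (la + lb >= sizeof out.text) lb = sizeof out.text - 1 - la;
    memcpy(out.text, a->text, la);
    memcpy(out.text + la, b->text, lb);
    out.text[la + lb] = '\0';
    out.from_literal = a->from_literal && b->from_literal;
    return out;
}

/* Returns true if the query was built only from literals; otherwise
 * prints a notice to stderr and returns false. */
static bool sketch_query_check(const sketch_str *q)
{
    if (!q->from_literal) {
        fprintf(stderr, "Notice: query contains non-literal data: %s\n", q->text);
        return false;
    }
    return true;
}
```

With this sketch, concatenating a literal prefix with `sketch_input(...)` data produces a string that fails `sketch_query_check`, while the same prefix followed by a literal `?` placeholder passes. Whether a failed check should only raise a notice or actually block the call is exactly the point the thread disagrees on.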
Re: [PHP-DEV] Introduction and some opcache SSE related stuff
Hi Andone, I'm not sure why nobody has replied to you yet, we've all looked at the PR and spent a lot of the day yesterday discussing it. I've CC'd Dmitry, he doesn't always read internals, so this should get his attention. Lastly, very cool ... I look forward to some more cleverness ... Cheers Joe On Wed, Jul 29, 2015 at 3:22 PM, Andone, Bogdan bogdan.and...@intel.com wrote: Hi Guys, My name is Bogdan Andone and I work for Intel in the area of SW performance analysis and optimizations. We would like to actively contribute to Zend PHP project and to involve ourselves in finding new performance improvement opportunities based on available and/or new hardware features. I am still in the source code digesting phase but I had a look to the fast_memcpy() implementation in opcache extension which uses SSE intrinsics: If I am not wrong fast_memcpy() function is not currently used, as I didn't find the -msse4.2 gcc flag in the Makefile. I assume you probably didn't see any performance benefit so you preserved generic memcpy() usage. I would like to propose a slightly different implementation which uses _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied memory is preserved in data cache, which is not bad as the interpreter will start to use this data without the need to go back one more time to memory. _mm_stream_si128() in the current implementation is intended to be used for stores where we want to avoid reading data into the cache and the cache pollution; in opcache scenario it seems that preserving the data in cache has a positive impact. Running php-cgi -T1 on WordPress4.1/index.php I see ~1% performance increase for the new version of fast_memcpy() compared with the generic memcpy(). Same result using a full load test with http_load on a Haswell EP 18 cores. Here is the proposed pull request: https://github.com/php/php-src/pull/1446 Related to the SW prefetching instructions in fast_memcpy()... they are not really useful in this place. 
Their benefit is almost negligible, as the address requested for prefetch will be needed at the next iteration (a few cycles later), while the time needed to get data from RAM is usually 100 cycles. Nevertheless, they don't hurt, and it seems they still have a very small benefit, so I preserved the original instruction and added a new prefetch request for the destination pointer. Hope it helps, Bogdan
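The difference between the two store intrinsics discussed in this thread, together with the compile-time switch idea, can be illustrated with a minimal C sketch. It is not opcache's fast_memcpy() nor the code in the pull request: the function name `sketch_fast_memcpy` and the macro `SKETCH_USE_STREAM` are hypothetical, prefetching is left out, and the SSE2 path is only taken when both pointers are 16-byte aligned and the size is a multiple of 16; everything else falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_load_si128, _mm_store_si128, _mm_stream_si128 */
#endif

/* Hypothetical switch: build with -DSKETCH_USE_STREAM to use non-temporal
 * stores (bypass the cache); the default uses regular cached stores. */
static void *sketch_fast_memcpy(void *dest, const void *src, size_t size)
{
    assert(dest != NULL && src != NULL);
#if defined(__SSE2__)
    if ((((uintptr_t)dest | (uintptr_t)src) & 15) == 0 && (size & 15) == 0) {
        __m128i *d = (__m128i *)dest;
        const __m128i *s = (const __m128i *)src;
        size_t n = size / 16;
        for (size_t i = 0; i < n; i++) {
            __m128i x = _mm_load_si128(s + i);
#if defined(SKETCH_USE_STREAM)
            _mm_stream_si128(d + i, x); /* written data does not stay in cache */
#else
            _mm_store_si128(d + i, x);  /* written data stays in the data cache */
#endif
        }
#if defined(SKETCH_USE_STREAM)
        _mm_sfence(); /* order non-temporal stores before later accesses */
#endif
        return dest;
    }
#endif
    /* Unaligned input, odd size, or no SSE2 at compile time. */
    return memcpy(dest, src, size);
}
```

Because only SSE2 intrinsics are used, a sketch like this stays within the baseline mentioned in the reply about compatibility; whether the cached variant is actually faster for a given workload would still need to be measured.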
Re: [PHP-DEV] Reclassify E_STRICT notices
On Thu, 30 Jul 2015, Ferenc Kovacs wrote: On Sun, Feb 22, 2015 at 11:30 PM, Nikita Popov nikita@gmail.com wrote: I would like to propose reclassifying our few existing E_STRICT notices and removing this error category: https://wiki.php.net/rfc/reclassify_e_strict As we don't really have good guidelines on when which type of error should be thrown, I'm mainly going by what category other similar errors use. I'm open to suggestions, but hope this will not deteriorate into total bikeshed. this RFC got accepted, but there are 4 more E_STRICTs in the core which were kept/missed: http://lxr.php.net/xref/PHP_TRUNK/ext/date/php_date.c#1544 I think E_DEPRECATED works for this as well - which I am perfectly happy with. cheers, Derick
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 30 Jul 2015, at 16:24, Scott Arciszewski sc...@paragonie.com wrote: Just because the solution is known doesn't mean it's known to everyone. Yes, and if you could just read what I was suggesting, rather than looking at the subject of this email (and the suggestion by Matt), then you will notice this is what I'm trying to do (so not just people asking questions on Stack Overflow). My suggestion is to educate, it also has a nice side effect of having a simple checking process for everything else (without breaking anything). On 30 Jul 2015, at 16:24, Scott Arciszewski sc...@paragonie.com wrote: On Thu, Jul 30, 2015 at 11:20 AM, Craig Francis cr...@craigfrancis.co.uk wrote: On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote: This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. Good, because my suggestion was also addressing XSS with poor (or completely missing) HTML escaping... have a look: http://news.php.net/php.internals/87207 https://bugs.php.net/bug.php?id=69886 Now I admit it won't fix everything with XSS (as HTML escaping is a bit harder), but it certainly will pick up quite a lot of the issues (and it wont break anything either, just help developers identify problems). And no, SQL injection is far from a solved problem... this is why, after 15 years of me trying to tell my fellow developers to not make these mistakes, I'm still finding them making them over and over again... hence why I'm making the above suggestion. 
Craig

On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote:

On Tue, Jul 28, 2015 at 1:33 PM, Matt Tait matt.t...@gmail.com wrote: Hi all, I've written an RFC (and PoC) about automatic detection and blocking of SQL injection vulnerabilities directly from inside PHP via automated taint analysis. https://wiki.php.net/rfc/sql_injection_protection

In short, we make zend_strings track where their value originated. If it originated as a T_STRING, from a primitive (like int) promotion, or as a concatenation of such strings, it's a query that can't have been SQL-injected by an attacker controlled string. If we can't prove that the query is safe, that means that the query is either certainly vulnerable to a SQL-injection vulnerability, or sufficiently complex that it should be parameterized just-to-be-sure.

There's also a working proof of concept over here: http://phpoops.cloudapp.net/oops.php You'll notice that the page makes a large number of SQL statements, most of which are not vulnerable to SQL injection, but one is. The proof of concept is smart enough to block that one vulnerable request, and leave all of the others unchanged.

In terms of performance, the cost here is negligible. This is just basic variable taint analysis under the hood (not an up-front intraprocedural static analysis or anything complex), so there's basically no slowdown.

PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a "feels safe" rather than "is safe" way.

What do you all think? There's obviously a bit more work to do; the PoC currently only covers mysqli_query, but I thought this stage is an interesting point to throw it open to comments before working to complete it.
Matt Hi Matt, PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a feels safe rather than is safe way. This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. If Google has information that indicates that SQLi is still more prevalent than XSS, I'd love to see this data. In my opinion, SQL injection is almost a solved problem. Use prepared statements where you can, and strictly whitelist where you cannot (i.e. ORDER BY {$column} ASC) Scott Arciszewski Chief Development Officer Paragon Initiative Enterprises https://paragonie.com -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php Just because the solution is known doesn't mean it's known to everyone. Diffusion of knowledge and good habits is the hardest problem in application security to solve. Look, for example, at how many college students learn to write
Re: [PHP-DEV] Disabling External Entities in libxml By Default
Rob Richards wrote on 30/07/2015 14:12:
> If you are already working with a trusted document then you should safely be able to disable the entity loader. If you aren't then wouldn't you want to do some sort of checking (especially if you don't have an XML gateway fronting the system) for other malicious things before even opening the document regardless if it has external entities or not.

Can you give any pointers to what kind of checking this would be, and how it would be carried out without parsing the XML document in the first place?

According to the bug report, one of the affected uses is the SoapClient, which by definition is dealing with remote data. I can see how that could be considered untrusted, but I can't think of any particular action that would make it more trusted (quite apart from the lack of an obvious point to intercept the data before it is parsed).

Would it not make more sense for the parser to operate in an untrusted mode - disabling external entities, maybe different limits on stack depth, etc?

Regards,
--
Rowan Collins
[IMSoP]
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On Thu, Jul 30, 2015 at 11:20 AM, Craig Francis cr...@craigfrancis.co.uk wrote: On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote: This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. Good, because my suggestion was also addressing XSS with poor (or completely missing) HTML escaping... have a look: http://news.php.net/php.internals/87207 https://bugs.php.net/bug.php?id=69886 Now I admit it won't fix everything with XSS (as HTML escaping is a bit harder), but it certainly will pick up quite a lot of the issues (and it wont break anything either, just help developers identify problems). And no, SQL injection is far from a solved problem... this is why, after 15 years of me trying to tell my fellow developers to not make these mistakes, I'm still finding them making them over and over again... hence why I'm making the above suggestion. Craig On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote: On Tue, Jul 28, 2015 at 1:33 PM, Matt Tait matt.t...@gmail.com wrote: Hi all, I've written an RFC (and PoC) about automatic detection and blocking of SQL injection vulnerabilities directly from inside PHP via automated taint analysis. https://wiki.php.net/rfc/sql_injection_protection In short, we make zend_strings track where their value originated. If it originated as a T_STRING, from a primitive (like int) promotion, or as a concatenation of such strings, it's query that can't have been SQL-injected by an attacker controlled string. If we can't prove that the query is safe, that means that the query is either certainly vulnerable to a SQL-injection vulnerability, or sufficiently complex that it should be parameterized just-to-be-sure. 
There's also a working proof of concept over here: http://phpoops.cloudapp.net/oops.php You'll notice that the page makes a large number of SQL statements, most of which are not vulnerable to SQL injection, but one is. The proof of concept is smart enough to block that one vulnerable request, and leave all of the others unchanged. In terms of performance, the cost here is negligible. This is just basic variable taint analysis under the hood, (not an up-front intraprocedurale static analysis or anything complex) so there's basically no slow down. PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a feels safe rather than is safe way. What do you all think? There's obviously a bit more work to do; the PoC currently only covers mysqli_query, but I thought this stage is an interesting point to throw it open to comments before working to complete it. Matt Hi Matt, PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a feels safe rather than is safe way. This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. If Google has information that indicates that SQLi is still more prevalent than XSS, I'd love to see this data. In my opinion, SQL injection is almost a solved problem. Use prepared statements where you can, and strictly whitelist where you cannot (i.e. 
ORDER BY {$column} ASC) Scott Arciszewski Chief Development Officer Paragon Initiative Enterprises https://paragonie.com Just because the solution is known doesn't mean it's known to everyone. Diffusion of knowledge and good habits is the hardest problem in application security to solve. Look, for example, at how many college students learn to write C programs with buffer overflow vulnerabilities in 2015. We need more effort on education, which is part of what I've been focusing on with Paragon Initiative and Stack Overflow. Scott Arciszewski Chief Development Officer Paragon Initiative Enterprises https://paragonie.com
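As an illustration of the provenance tracking summarised in the quoted RFC text above (a string is provably safe only if it came from a literal, an integer promotion, or a concatenation of such strings), here is a minimal C sketch. It is not the RFC's patch and not how zend_string is laid out: tracked_str, from_literal(), from_input(), from_int(), concat() and query_allowed() are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a string value that carries one provenance bit. */
typedef struct {
    char text[256];
    bool trusted; /* true only for literals, int promotions, and concatenations of those */
} tracked_str;

/* A string written in the program source itself: trusted. */
static tracked_str from_literal(const char *s)
{
    tracked_str r;
    assert(s != NULL);
    snprintf(r.text, sizeof r.text, "%s", s);
    r.trusted = true;
    return r;
}

/* A string that arrived from outside (request data, files, ...): untrusted. */
static tracked_str from_input(const char *s)
{
    tracked_str r = from_literal(s);
    r.trusted = false;
    return r;
}

/* An integer promoted to a string cannot carry SQL syntax: trusted. */
static tracked_str from_int(long n)
{
    tracked_str r;
    snprintf(r.text, sizeof r.text, "%ld", n);
    r.trusted = true;
    return r;
}

/* Concatenation is trusted only if both operands are trusted. */
static tracked_str concat(const tracked_str *a, const tracked_str *b)
{
    tracked_str r;
    snprintf(r.text, sizeof r.text, "%s%s", a->text, b->text);
    r.trusted = a->trusted && b->trusted;
    return r;
}

/* The check a query function would make before sending the SQL. */
static bool query_allowed(const tracked_str *query)
{
    return query->trusted;
}
```

The point of the sketch is only that the check is a flag propagated at each string operation, which is why the RFC text describes the runtime cost as small; whether that holds in the real engine is for the proof of concept to show.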
Re: [PHP-DEV] Introduction and some opcache SSE related stuff
Hi Bogdan,

On Wed, Jul 29, 2015 at 5:22 PM, Andone, Bogdan bogdan.and...@intel.com wrote:
> Hi Guys,
>
> My name is Bogdan Andone and I work for Intel in the area of SW performance analysis and optimizations. We would like to actively contribute to Zend PHP project and to involve ourselves in finding new performance improvement opportunities based on available and/or new hardware features. I am still in the source code digesting phase but I had a look to the fast_memcpy() implementation in opcache extension which uses SSE intrinsics:
>
> If I am not wrong fast_memcpy() function is not currently used, as I didn't find the -msse4.2 gcc flag in the Makefile. I assume you probably didn't see any performance benefit so you preserved generic memcpy() usage.

This is not SSE4.2, this is SSE2. Any x86_64 target implements SSE2, so it's enabled by default on x86_64 systems (at least on Linux). It may also be enabled on x86 targets by adding the -msse2 option.

> I would like to propose a slightly different implementation which uses _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied memory is preserved in data cache, which is not bad as the interpreter will start to use this data without the need to go back one more time to memory. _mm_stream_si128() in the current implementation is intended to be used for stores where we want to avoid reading data into the cache and the cache pollution; in opcache scenario it seems that preserving the data in cache has a positive impact.

_mm_stream_si128() was used on purpose, to avoid CPU cache pollution, because data copied from SHM to process memory is not necessarily used before eviction. That said, I'm not completely sure. Maybe _mm_store_si128() can provide a better result.

> Running php-cgi -T1 on WordPress4.1/index.php I see ~1% performance increase for the new version of fast_memcpy() compared with the generic memcpy(). Same result using a full load test with http_load on a Haswell EP 18 cores.

1% is a really big improvement. I'll be able to check this only next week (when I'm back from vacation).

> Here is the proposed pull request: https://github.com/php/php-src/pull/1446
>
> Related to the SW prefetching instructions in fast_memcpy()... they are not really useful in this place. Their benefit is almost negligible as the address requested for prefetch will be needed at the next iteration (few cycles later), while the time needed to get data from RAM is 100 cycles usually.. Nevertheless... they don't hurt and it seems they still have a very small benefit so I preserved the original instruction and I added a new prefetch request for the destination pointer.

I also didn't see a significant difference from software prefetching.

Thanks. Dmitry.

> Hope it helps,
> Bogdan
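To make the store-versus-stream trade-off in this message concrete, here is a minimal C sketch. It is not the opcache fast_memcpy() and not the code in the pull request: copy_aligned16 and its keep_in_cache parameter are hypothetical names, and the sketch assumes both buffers are 16-byte aligned and the size is a multiple of 16. The intrinsics used are the standard SSE2 ones from <emmintrin.h>; where SSE2 is not available the sketch simply falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_load_si128, _mm_store_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical helper: copy `size` bytes between 16-byte-aligned buffers.
 * keep_in_cache != 0: regular stores, the destination lines stay in the cache.
 * keep_in_cache == 0: non-temporal stores, which avoid polluting the cache. */
static void copy_aligned16(void *dest, const void *src, size_t size, int keep_in_cache)
{
    assert(((uintptr_t)dest & 15) == 0);
    assert(((uintptr_t)src & 15) == 0);
    assert((size & 15) == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dest;
    const __m128i *s = (const __m128i *)src;
    const __m128i *end = s + size / 16;

    while (s < end) {
        __m128i chunk = _mm_load_si128(s++);   /* aligned 16-byte load */
        if (keep_in_cache) {
            _mm_store_si128(d++, chunk);       /* ordinary aligned store */
        } else {
            _mm_stream_si128(d++, chunk);      /* non-temporal store */
        }
    }
    if (!keep_in_cache) {
        _mm_sfence(); /* order the non-temporal stores before later stores */
    }
#else
    (void)keep_in_cache;
    memcpy(dest, src, size); /* portable fallback, assumption of this sketch */
#endif
}
```

Which variant is faster depends on whether the copied data is read again before it would be evicted, so it has to be measured on a real workload, as both messages in this thread point out.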
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 30 Jul 2015, at 16:26, Ronald Chmara rona...@gmail.com wrote:
> Perhaps I have missed something in this discussion

I think you have... my email from a couple of weeks ago was ignored... so I replied to Matt's suggestion (which is similar, but different).

Please, just spend a few minutes reading my suggestion, it has absolutely nothing to do with breaking applications:

http://news.php.net/php.internals/87207
https://bugs.php.net/bug.php?id=69886

And yes, I do have a bypass_the_nerfing function (well, a function to say the variable has already been escaped)... but the idea is that it's ever so slightly harder to use than the related escaping functions, and rarely needed.

On 30 Jul 2015, at 16:26, Ronald Chmara rona...@gmail.com wrote:
> Perhaps I have missed something in this discussion where such a change to PHP does not break every single application that is supposed to pass raw, user submitted, SQL *without* getting prepared/nerfed, or warned about, by intentional application design.
>
> If we're just limiting the nerfing for submitted GPC variables (since PHP is used a lot for web applications) we still have a non-trivial number of those installed applications which require raw, user created, unescaped SQL, passing through to function as designed.
>
> I am thinking of the class of applications like phpMyAdmin, as well as the millions of other database utility scripts, application install scripts, (etc.) out there that perform similar tasks, that need to pass raw SQL, as crafted by users, without preparation, intentionally.
>
> Of course, we could just add a bypass_the_nerfing() function, and such a function could then possibly see widespread adoption, everywhere, rendering the entire exercise moot.
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 30 Jul 2015, at 13:47, Xinchen Hui larue...@php.net wrote: anyway, with PHP7's new zend_string, and string flags, the implementation will become easier. Hi Xinchen, Glad to hear that you are still looking into this... please let me know if there is anything I can do to help (unfortunately I'm not a C programer). Out of interest, if you are going to continue using taint_marks, as the RFC suggested... can I suggest all variables start with 0 (or undefined)... then you can set flags as the variables are passed though functions like htmlentities and pg_escape_literal? This way all variables are treated as plain (unsafe), and then developers either need to escape them (e.g. when printing to output, or using in SQL, CLI, etc), or they can mark them as having already been escaped (rare). Likewise, I know you have examples that say SCRIPT_FILENAME is safe by default (I kind of disagree)... it would still be advisable to encode them, even if they are being included in the HTML... personally I would not have any variables marked as safe by default (with the single exception of strings that are defined in the PHP code itself). Craig On 30 Jul 2015, at 13:47, Xinchen Hui larue...@php.net wrote: Hey: On Thu, Jul 30, 2015 at 8:14 PM, Joe Watkins pthre...@pthreads.org wrote: I find myself agreeing with Pierre; The wrong signal would be sent. History should teach us there is no such thing as (a) safe mode. Xinchen did used to work on a taint extension, I wonder why that was stopped ? yes, it is https://github.com/laruence/php-taint Anyway, I was too busy so I didn't make it supports PHP-5.6, I was hoping someone could help(it supports 5.5 now). it is a complex extension, and using tricky way to keep taint infos anyway, with PHP7's new zend_string, and string flags, the implementation will become easier. I have a plan to make it supports PHP7.. thanks Worth noticing that the extension is rather complex, touching many parts of the engine, changing many things ... 
which I don't really like. Cheers Joe On Thu, Jul 30, 2015 at 10:14 AM, Craig Francis cr...@craigfrancis.co.uk wrote: On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote: But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. I completely agree on education, and what I'm hoping for... and this is how we can educate everyone :-) My suggestion for taints (not quite the same as the one from Matt or Wietse) was not to change the way good programs are created/executed, but simply an education device, which can also pick up mistakes that experienced developers make. While my first post on this mailing list gives a better overview: http://news.php.net/php.internals/87207 The original implementation suggestion is at: https://bugs.php.net/bug.php?id=69886 You will see that it does nothing more than create notices to say erm, do you want to be doing this?. This is something that only PHP can do, unless you can find a way of changing every single article / code example on the internet :-) So, with your example... if you want to use a variable for a table/field prefix, that is perfectly fine... in fact, it won't need any changes, as the prefix will probably be hard coded as a string within a PHP script (something I called ETYPE_CONSTANT). But if not (e.g. storing the prefix in an ini file?), then I've shown an example of how that can be handled with the proposed string_encoding_set function (something I should have probably called string_escaping_set)... which is simply to tell PHP that this one variable is already safe (something I can't see being needed very often). Craig On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote: On 29/07/15 16:11, Craig Francis wrote: I completely disagree... prepared statements are just as vulnerable, and so are ORM's. 
You can push developers towards these solutions, and that would be good, but you are completely blind if you think an uneducated developer won't do:

if ($stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=" . $_GET['name'])) { }

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place. Since the taint extension only covers mysql and sqlite it's of little use if we manage to convert 'uneducated developer' to any of the more secure databases, and that was one of the reasons why mysql was dropped from being loaded by default. Once one starts from a base of parametrised sql queries the lax programming methods many mysql guides and books continue to push can be reversed. Throwing more bloat into php to create 'WTF' errors
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote: This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. Good, because my suggestion was also addressing XSS with poor (or completely missing) HTML escaping... have a look: http://news.php.net/php.internals/87207 https://bugs.php.net/bug.php?id=69886 Now I admit it won't fix everything with XSS (as HTML escaping is a bit harder), but it certainly will pick up quite a lot of the issues (and it wont break anything either, just help developers identify problems). And no, SQL injection is far from a solved problem... this is why, after 15 years of me trying to tell my fellow developers to not make these mistakes, I'm still finding them making them over and over again... hence why I'm making the above suggestion. Craig On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote: On Tue, Jul 28, 2015 at 1:33 PM, Matt Tait matt.t...@gmail.com wrote: Hi all, I've written an RFC (and PoC) about automatic detection and blocking of SQL injection vulnerabilities directly from inside PHP via automated taint analysis. https://wiki.php.net/rfc/sql_injection_protection In short, we make zend_strings track where their value originated. If it originated as a T_STRING, from a primitive (like int) promotion, or as a concatenation of such strings, it's query that can't have been SQL-injected by an attacker controlled string. If we can't prove that the query is safe, that means that the query is either certainly vulnerable to a SQL-injection vulnerability, or sufficiently complex that it should be parameterized just-to-be-sure. 
There's also a working proof of concept over here: http://phpoops.cloudapp.net/oops.php You'll notice that the page makes a large number of SQL statements, most of which are not vulnerable to SQL injection, but one is. The proof of concept is smart enough to block that one vulnerable request, and leave all of the others unchanged. In terms of performance, the cost here is negligible. This is just basic variable taint analysis under the hood, (not an up-front intraprocedurale static analysis or anything complex) so there's basically no slow down. PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a feels safe rather than is safe way. What do you all think? There's obviously a bit more work to do; the PoC currently only covers mysqli_query, but I thought this stage is an interesting point to throw it open to comments before working to complete it. Matt Hi Matt, PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a feels safe rather than is safe way. This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. If Google has information that indicates that SQLi is still more prevalent than XSS, I'd love to see this data. In my opinion, SQL injection is almost a solved problem. Use prepared statements where you can, and strictly whitelist where you cannot (i.e. 
ORDER BY {$column} ASC) Scott Arciszewski Chief Development Officer Paragon Initiative Enterprises https://paragonie.com
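The "strictly whitelist where you cannot" advice quoted above (for identifiers such as an ORDER BY column, which cannot be bound as a parameter) can be sketched as follows. This is a minimal C illustration, not code from any of the posters: the column list, allowed_column() and build_order_by() are hypothetical, and the City table is borrowed from the example query used elsewhere in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical allowlist of columns a user may sort by. */
static const char *const SORTABLE_COLUMNS[] = { "name", "district", "population" };

/* Returns the matching allowlist entry (never the caller's own string),
 * or NULL if the requested name is not an exact match. */
static const char *allowed_column(const char *requested)
{
    size_t count = sizeof SORTABLE_COLUMNS / sizeof SORTABLE_COLUMNS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(requested, SORTABLE_COLUMNS[i]) == 0) {
            return SORTABLE_COLUMNS[i];
        }
    }
    return NULL;
}

/* Builds the query into `out`; returns 0 on success, -1 if the column is
 * not allowed or the buffer is too small. */
static int build_order_by(char *out, size_t out_size, const char *requested)
{
    assert(out != NULL && requested != NULL);
    const char *column = allowed_column(requested);
    if (column == NULL) {
        return -1;
    }
    int written = snprintf(out, out_size, "SELECT Name FROM City ORDER BY %s ASC", column);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Because the text placed in the query is always one of the program's own constants, user input only selects among them and never reaches the SQL string directly.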
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
Perhaps I have missed something in this discussion where such a change to PHP does not break every single application that is supposed to pass raw, user submitted, SQL *without* getting prepared/nerfed, or warned about, by intentional application design.

If we're just limiting the nerfing for submitted GPC variables (since PHP is used a lot for web applications) we still have a non-trivial number of those installed applications which require raw, user created, unescaped SQL, passing through to function as designed.

I am thinking of the class of applications like phpMyAdmin, as well as the millions of other database utility scripts, application install scripts, (etc.) out there that perform similar tasks, that need to pass raw SQL, as crafted by users, without preparation, intentionally.

Of course, we could just add a bypass_the_nerfing() function, and such a function could then possibly see widespread adoption, everywhere, rendering the entire exercise moot.
Re: [PHP-DEV] Core functions throwing exceptions in PHP7
On Jul 30, 2015 2:27 PM, Niklas Keller m...@kelunik.com wrote:
> I prefer Exception, too, because it's I/O related.
>
> @Scott: You can open votes on everything, doesn't matter, just create a page with a vote. I just don't know where to put it in the wiki, because it's not a RFC.
>
> Regards, Niklas

I'm not sure how to do that.

I have a noncritical security patch I intend to write for random_bytes() that I would like to submit as a PR but first I'd like to see what the resolution will be re: Exceptions. Also, merge conflicts aren't fun. ;)

If there's anything I can do to get those two merged faster, please let me know.

Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises
https://paragonie.com
Re: [PHP-DEV] Disabling External Entities in libxml By Default
Stas,

On Thu, Jul 30, 2015 at 2:57 PM, Stanislav Malyshev smalys...@gmail.com wrote:
> Hi!
>
>> The problem here is that imagine the following:
>
> I think if we separate the loading the initial file (i.e., starting point of the XML parser) and the loading the entities from that file (which is not happening right now) we'd solve many BC problems. Not sure about SOAP, but many others for sure.

Yeah, that seems reasonable. I'll take a peek at the code to see how bad it will be to separate it (though I'm not familiar with the xml extensions much).

>> I know that you want it to work, but this is actually a great place to fail, because you're loading a trusted resource over HTTP. Meaning that an attacker could MITM and inject malicious XML into the response, and own your server without even needing to own the endpoint.
>
> I feel like XML parser is a wrong place to solve this problem, transport security can be done in HTTPS, signatures, etc. Otherwise many protocols that rely on XML - such as SAML, which is quite widely used - would be completely useless.

Yeah, it's a pretty complex problem. I think there should likely be multiple levels of defense. One level is limiting external entity requests by default. Another level would be potentially to add a context option to dom document to allow you to whitelist URLs or servers.

I think the point would be to document and make it secure by default, but provide the ability to turn it back on if you know what you're doing (though that potentially has a bunch of possible problems as well).

Anthony
Re: [PHP-DEV] Disabling External Entities in libxml By Default
On 7/30/15 10:30 AM, Rowan Collins wrote:
> Rob Richards wrote on 30/07/2015 14:12:
>> If you are already working with a trusted document then you should safely be able to disable the entity loader. If you aren't then wouldn't you want to do some sort of checking (especially if you don't have an XML gateway fronting the system) for other malicious things before even opening the document regardless if it has external entities or not.
>
> Can you give any pointers to what kind of checking this would be, and how it would be carried out without parsing the XML document in the first place?
>
> According to the bug report, one of the affected uses is the SoapClient, which by definition is dealing with remote data. I can see how that could be considered untrusted, but I can't think of any particular action that would make it more trusted (quite apart from the lack of an obvious point to intercept the data before it is parsed).
>
> Would it not make more sense for the parser to operate in an untrusted mode - disabling external entities, maybe different limits on stack depth, etc?
>
> Regards,

It all depends upon what you are trying to accomplish, as this covers tree, streaming, different types of schemas, xsl, etc... For example, you can easily check if there is a DTD, imports/includes, specific xslt functionality (the list goes on and on) without ever having to load the document. There really is no one-size-fits-all imo, so what one considers untrusted someone else would consider trusted.
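One possible reading of "check if there is a DTD ... without ever having to load the document" is a byte-level pre-screen run before the XML is handed to the parser. The C sketch below is only that: a deliberately naive illustration with hypothetical function names (contains_bytes, declares_dtd). It is an assumption about the kind of check meant, not something taken from libxml or from the message. It can report false positives (for example a match inside a comment or CDATA section) and assumes an ASCII-compatible encoding, so it is no substitute for parser-level restrictions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `needle` occurs anywhere in the first `len` bytes of `buf`. */
static bool contains_bytes(const char *buf, size_t len, const char *needle)
{
    size_t needle_len = strlen(needle);
    if (needle_len == 0 || needle_len > len) {
        return false;
    }
    for (size_t i = 0; i + needle_len <= len; i++) {
        if (memcmp(buf + i, needle, needle_len) == 0) {
            return true;
        }
    }
    return false;
}

/* Naive pre-screen: does the raw document mention a DOCTYPE or ENTITY
 * declaration at all? XML keywords are case-sensitive, so an exact
 * byte match is used. */
static bool declares_dtd(const char *xml, size_t len)
{
    assert(xml != NULL);
    return contains_bytes(xml, len, "<!DOCTYPE") || contains_bytes(xml, len, "<!ENTITY");
}
```

A caller could reject or quarantine any input for which declares_dtd() returns true and only then pass the rest to the real parser.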
Re: [PHP-DEV] Disabling External Entities in libxml By Default
On 7/30/15 2:57 PM, Stanislav Malyshev wrote:
> Hi!
>
>> The problem here is that imagine the following:
>
> I think if we separate the loading the initial file (i.e., starting point of the XML parser) and the loading the entities from that file (which is not happening right now) we'd solve many BC problems. Not sure about SOAP, but many others for sure.

It will solve many, but your guess is as good as mine as to what the split will be. It all comes down to what people are doing with XML. I've had comments from both sides: people who hate the way it's currently implemented and have suggested the idea of allowing the initial file, and others who like it as is. Regardless, though, the current implementation should definitely not be enabled by default, but I could see something laxer like this. I still say it should be a different function and leave the current one as is.

>> I know that you want it to work, but this is actually a great place to fail, because you're loading a trusted resource over HTTP. Meaning that an attacker could MITM and inject malicious XML into the response, and own your server without even needing to own the endpoint.
>
> I feel like XML parser is a wrong place to solve this problem, transport security can be done in HTTPS, signatures, etc. Otherwise many protocols that rely on XML - such as SAML, which is quite widely used - would be completely useless.

Rob
Re: [PHP-DEV] Disabling External Entities in libxml By Default
On 30 July 2015 19:25:47 BST, Anthony Ferrara ircmax...@gmail.com wrote:
> I thought SOAP was dead already.

Tell that to the Enterprises who drag and drop in Visual Studio to create useless wrappers around hand-written XML because that's their definition of web service. :P

I don't fully understand where this vulnerability kicks in (other than <!ENTITY> which I don't think I've ever needed to consume) but any change in default behaviour needs to account for real-life usage, or it will simply become standard practice to switch it back to insecure mode.

Regards,
--
Rowan Collins
[IMSoP]
Re: [PHP-DEV] Disabling External Entities in libxml By Default
On 30 July 2015 21:35:01 BST, Rob Richards rricha...@cdatazone.org wrote:
> On 7/30/15 10:30 AM, Rowan Collins wrote:
>> Rob Richards wrote on 30/07/2015 14:12:
>>> If you are already working with a trusted document then you should safely be able to disable the entity loader. If you aren't then wouldn't you want to do some sort of checking (especially if you don't have an XML gateway fronting the system) for other malicious things before even opening the document regardless if it has external entities or not.
>>
>> Can you give any pointers to what kind of checking this would be, and how it would be carried out without parsing the XML document in the first place?
>>
>> According to the bug report, one of the affected uses is the SoapClient, which by definition is dealing with remote data. I can see how that could be considered untrusted, but I can't think of any particular action that would make it more trusted (quite apart from the lack of an obvious point to intercept the data before it is parsed).
>>
>> Would it not make more sense for the parser to operate in an untrusted mode - disabling external entities, maybe different limits on stack depth, etc?
>>
>> Regards,
>
> All depends upon what you are trying to accomplish as this covers tree, streaming, different types of schemas, xsl, etc... For example, you can easily check if there is a DTD, imports/includes, specific xslt functionality, list goes on and on without ever having to load the document. There really is no one size fit all imo so what one considers untrusted someone else would consider trusted.

So effectively we should all write partial XML parsers to determine the contents of the file, in order to decide if it's the data we expected? Would it not make more sense to leave that to the XML library, with a whitelist of features we actually need, URLs we trust for includes, etc?

I never want an XML file to execute system commands on my behalf; do I have to write a regex to make sure they don't?

Regards,
--
Rowan Collins
[IMSoP]
Re: [PHP-DEV] Introduction and some opcache SSE related stuff
On Jul 31, 2015 2:12 AM, Matt Wilmas php_li...@realplain.com wrote:
> Hi Dmitry, Bogdan,
>
> - Original Message - From: Dmitry Stogov Sent: Thursday, July 30, 2015
>
>> Hi Bogdan,
>>
>> On Wed, Jul 29, 2015 at 5:22 PM, Andone, Bogdan bogdan.and...@intel.com wrote:
>>> Hi Guys,
>>>
>>> My name is Bogdan Andone and I work for Intel in the area of SW performance analysis and optimizations. We would like to actively contribute to the Zend PHP project and to involve ourselves in finding new performance improvement opportunities based on available and/or new hardware features.
>>>
>>> I am still in the source code digesting phase, but I had a look at the fast_memcpy() implementation in the opcache extension, which uses SSE intrinsics. If I am not wrong, fast_memcpy() is not currently used, as I didn't find the -msse4.2 gcc flag in the Makefile. I assume you probably didn't see any performance benefit, so you preserved generic memcpy() usage.
>>
>> This is not SSE4.2, this is SSE2. Any x86_64 target implements SSE2, so it's enabled by default on x86_64 systems (at least on Linux). It may also be enabled on x86 targets by adding the -msse2 option.
>
> Right, I was gonna say, I think that was a mistake, and all x86_64 should be using it at least... Of course, using anything newer that needs special options is nearly useless, since I guess the vast majority aren't building themselves, but using lowest-common-denominator repos. I had been wondering about speeding up some other things, maybe taking advantage of SSE4.x (string stuff, I don't know), but... like I said.
>
> Runtime checks would be awesome, but except for recent GCC, the intrinsics aren't available unless the corresponding SSE option is enabled (lame!). So that requires a separate compilation unit. :-/ Of course, I guess if the intrinsic maps simply to the instruction, one could just do it with inline asm, if one wanted to do runtime CPU checking.
>
>>> I would like to propose a slightly different implementation which uses _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied memory is preserved in the data cache, which is not bad, as the interpreter will start to use this data without the need to go back one more time to memory. _mm_stream_si128() in the current implementation is intended for stores where we want to avoid reading data into the cache and polluting it; in the opcache scenario it seems that preserving the data in cache has a positive impact.
>>
>> _mm_stream_si128() was used on purpose, to avoid CPU cache pollution, because data copied from SHM to process memory is not necessarily used before eviction. By the way, I'm not completely sure. Maybe _mm_store_si128() can provide a better result.
>
> Interesting (that _stream was used on purpose). :-)
>
>>> Running php-cgi -T1 on WordPress4.1/index.php I see a ~1% performance increase for the new version of fast_memcpy() compared with the generic memcpy(). Same result using a full load test with http_load on a Haswell EP 18 cores.
>>
>> 1% is a really big improvement. I'll be able to check this only next week (when back from vacation).
>
> Well, he talks like he was comparing to *generic* memcpy(), so...? But not sure how that would have been accomplished.
>
> BTW guys, I was wondering before why fast_memcpy() only in this opcache area? For the prefetch and/or cache pollution reasons?

Just because in this place we may copy big blocks, and we may also align them properly, so we can use compact and fast inlined code.

> Because shouldn't the library functions in glibc, etc. already be using versions optimized for the CPU at runtime? So is generic memcpy() already fast? (Other than overhead for a function call.)

glibc already uses an optimized memcpy(), but this is a universal function that has to check for different conditions, like the alignment of source and destination, and the length.

>>> Here is the proposed pull request: https://github.com/php/php-src/pull/1446
>>>
>>> Related to the SW prefetching instructions in fast_memcpy()... they are not really useful in this place. Their benefit is almost negligible, as the address requested for prefetch will be needed at the next iteration (a few cycles later), while the time needed to get data from RAM is usually 100 cycles. Nevertheless... they don't hurt, and it seems they still have a very small benefit, so I preserved the original instruction and added a new prefetch request for the destination pointer.
>>
>> I also didn't see a significant difference from software prefetching.
>
> So how about prefetching further/more iterations ahead...?

I tried, but didn't see a difference either.

Thanks. Dmitry.

>> Thanks. Dmitry.
>
>>> Hope it helps,
>>> Bogdan
>
> - Matt
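For readers following the _mm_store_si128() versus _mm_stream_si128() question, here is a minimal sketch of the two store variants in an aligned block copy. It is not the opcache code or the code from the pull request: the name sketch_aligned_copy(), the use_stream switch, and the 64-byte block size are assumptions made for illustration, and the software prefetches discussed in the thread are left out. On targets without SSE2 the sketch simply falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> /* SSE2: _mm_load_si128, _mm_store_si128, _mm_stream_si128 */
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

/*
 * Hypothetical helper (not the opcache function): copies `size` bytes in
 * 64-byte blocks. Preconditions, checked by assert(): dest and src are
 * 16-byte aligned and size is a multiple of 64.
 *
 * use_stream == 0: regular stores, the copied data stays in the cache.
 * use_stream != 0: non-temporal stores, the copy bypasses the cache.
 */
static void sketch_aligned_copy(void *dest, const void *src, size_t size, int use_stream)
{
    assert(((uintptr_t)dest & 15) == 0);
    assert(((uintptr_t)src & 15) == 0);
    assert((size & 63) == 0);

#if SKETCH_HAVE_SSE2
    __m128i *d = (__m128i *)dest;
    const __m128i *s = (const __m128i *)src;
    const __m128i *end = s + size / sizeof(__m128i);

    while (s < end) {
        /* Load four 16-byte vectors (one 64-byte block). */
        __m128i x0 = _mm_load_si128(s + 0);
        __m128i x1 = _mm_load_si128(s + 1);
        __m128i x2 = _mm_load_si128(s + 2);
        __m128i x3 = _mm_load_si128(s + 3);

        if (use_stream) {
            _mm_stream_si128(d + 0, x0);
            _mm_stream_si128(d + 1, x1);
            _mm_stream_si128(d + 2, x2);
            _mm_stream_si128(d + 3, x3);
        } else {
            _mm_store_si128(d + 0, x0);
            _mm_store_si128(d + 1, x1);
            _mm_store_si128(d + 2, x2);
            _mm_store_si128(d + 3, x3);
        }
        s += 4;
        d += 4;
    }
    if (use_stream) {
        /* Make the non-temporal stores globally visible before returning. */
        _mm_sfence();
    }
#else
    /* No SSE2 available at compile time: plain memcpy() fallback. */
    (void)use_stream;
    memcpy(dest, src, size);
#endif
}
```

Both variants produce the same bytes in the destination; the only difference is whether the destination lines end up in the cache, which is exactly the trade-off Bogdan and Dmitry are weighing. Which one wins has to be measured on the workload.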
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 29/07/15 16:11, Craig Francis wrote:
> I completely disagree... prepared statements are just as vulnerable, and so are ORMs. You can push developers towards these solutions, and that would be good, but you are completely blind if you think an uneducated developer won't do:
>
> if ($stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=" . $_GET['name'])) { }

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place.

Since the taint extension only covers mysql and sqlite, it's of little use if we manage to convert the 'uneducated developer' to any of the more secure databases, and that was one of the reasons why mysql was dropped from being loaded by default. Once one starts from a base of parametrised SQL queries, the lax programming methods that many mysql guides and books continue to push can be reversed.

Throwing more bloat into PHP to create 'WTF' errors just adds to a new user's frustration and annoys experienced users who have very good reasons for building queries using clean variables. MANY abstraction layers use variables to add prefixes to table names or fields.

Educate ... don't nanny ...

--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
[PHP-DEV] Benchmark Results for PHP Master 2015-07-30
Results for project php-src-nightly, build date 2015-07-30 05:00:00+03:00
commit: ae1a4f47e6bd9f8d1d969e5080dae60136d7444b
revision_date: 2015-07-29 21:00:43+02:00
environment: Haswell-EP
cpu: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz 2x18 cores, stepping 2, LLC 45 MB
mem: 128 GB
os: CentOS 7.1
kernel: Linux 3.10.0-229.4.2.el7.x86_64

Note: Baseline results were generated using release php-7.0.0beta1, with hash ad8a73dd55c087de465ad80e8715611693bb1460 from 2015-07-07 16:02:13+00:00

benchmark                     executable   unit   change since   change since
                                                  yesterday      php-7.0.0beta1
Wordpress 4.2.2 cgi -T1       php opc=on   fps    -1.36%         -0.78%
Drupal 7.36 cgi -T1           php opc=on   fps    -0.08%          0.25%
MediaWiki 1.23.9 cgi -T5000   php opc=on   fps    -0.97%         -1.16%
bench.php cgi -T1             php opc=on   sec     0.65%         -3.81%
micro_bench.php cgi -T1       php opc=on   sec     2.28%          2.94%
mandelbrot.php cgi -T1        php opc=on   sec     0.06%         -1.76%

Our lab does a nightly source pull and build of the PHP project and measures performance changes against the previous stable version and the previous nightly measurement. This is provided as a service to the community so that quality issues with current hardware can be identified quickly.

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document may contain information on products, services and/or processes in development. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. (C) 2015 Intel Corporation.
Re: [PHP-DEV] Core functions throwing exceptions in PHP7
On Mon, Jul 27, 2015 at 2:03 PM, Anthony Ferrara ircmax...@gmail.com wrote:
> Rowan,
>
>> This is certainly some people's concern, but Anatol has raised a subtly different consistency-related point, which is this: Since we have no policy for what kinds of Throwable should be emitted in what circumstance, throwing anything in this function sets a precedent which will have to be incorporated in any future plan.
>>
>> Assuming nobody is fundamentally against ever adding Throwables to core functions, there are a minimum number of questions that need to be agreed before adding the first one:
>>
>> - when should we inherit from Error and when from Exception?
>
> IMHO, Errors signify programmer error, where Exceptions signify unknown or runtime errors. Meaning that an Error should always be a problem with your code, but an Exception could be a systems problem, a user problem or a problem in your code.
>
> While that's slightly off-topic to this discussion, it frames which type random_* would throw pretty clearly (Exception).
>
>> - is it ever OK to throw a plain Error or Exception (thus forcing users into the otherwise bad practice of catching those base classes)?
>
> For now, I think that's a good practice. It doesn't constrain us from sub-typing down the road (7.1, etc), but it also lets us build the support in today. For example, if we throw Exception, we could make it php\RandomException in 7.1 without issue (all we need to get right is the hierarchy parent).
>
>> - if not throwing the base class, how specific should sub-classes be? (i.e. a framework for defining the hierarchy, not necessarily the hierarchy itself)
>
> I think this is something that should be RFC'd for 7.1. I don't think that limits us here though.
>
>> If we can get agreement on those points in time for 7.0, fine, but time is very tight, and the window for such discussions has theoretically closed...
>
> I think the only real agreement we need is Error vs Exception. If we can agree on one of those, we can do the rest in 7.1 without worrying about BC...
>
> Anthony

I'm fine with either Error or Exception. I'd prefer Exception (easier to write a sane backport for PHP 5.6) but I leave this decision in the hands of others.

/**
 * Slightly insane PHP 5 backport but it works
 */
class Error extends Exception { }
// Done!

Does anybody feel particularly strongly about one or the other? If so, should we set up a vote somewhere? (I don't have vote karma on RFCs etc., so I don't know if the existing infrastructure would work.)

If not, can we get PR 1397 1398 merged? :)

Regards,
Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises https://paragonie.com
Re: [PHP-DEV] Core functions throwing exceptions in PHP7
2015-07-30 19:12 GMT+02:00 Scott Arciszewski sc...@paragonie.com:
> [...]
> I'm fine with either Error or Exception. I'd prefer Exception (easier to write a sane backport for PHP 5.6) but I leave this decision in the hands of others.
> [...]
> Does anybody feel particularly strong about one or the other? If so, should we set up a vote somewhere? (I don't vote karma on RFCs etc. so I don't know if the existing infrastructure would work.)
>
> If not, can we get PR 1397 1398 merged? :)

I prefer Exception, too, because it's I/O related.

@Scott: You can open votes on everything, doesn't matter, just create a page with a vote. I just don't know where to put it in the wiki, because it's not an RFC.

Regards,
Niklas
Re: [PHP-DEV] Disabling External Entities in libxml By Default
Hello :-),

Huge +1 from the [Hoa] community. We have had it disabled by default for a long time.

However, could it introduce potential regressions (BC breaks)? I guess yes. So I would go for PHP 7.0 instead of PHP 7.1.

Cheers!

[Hoa]: http://hoa-project.net/

On 29/07/15 22:37, Anthony Ferrara wrote:
> All,
>
> I wanted to float an idea by you for PHP 7 (or 7.1 depending on the RM's feedback).
>
> Currently, PHP by default is vulnerable to XXE attacks: https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing
>
> To bypass this, you need to turn off external entity loading:
>
> libxml_disable_entity_loader(true);
>
> What I'm proposing is to disable entity loading by default. That way it requires developers to opt-in to actually load external entities.
>
> Thoughts?
>
> Anthony
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
Even if some of those people replying haven't read or don't understand what you are suggesting, it is not a good tactic to assume that and reply with "read the RFC".

There is a good chance the majority of the people replying have read the RFC, and found reason to be negative, reserved, cautious, or whatever.

The best thing you can do now is read those responses again, and try to find what they are saying, if you want the conversation to continue.

Cheers
Joe

On Thu, Jul 30, 2015 at 4:38 PM, Craig Francis cr...@craigfrancis.co.uk wrote:
> On 30 Jul 2015, at 16:24, Scott Arciszewski sc...@paragonie.com wrote:
>> Just because the solution is known doesn't mean it's known to everyone.
>
> Yes, and if you could just read what I was suggesting, rather than looking at the subject of this email (and the suggestion by Matt), then you will notice this is what I'm trying to do (so not just people asking questions on Stack Overflow).
>
> My suggestion is to educate; it also has a nice side effect of having a simple checking process for everything else (without breaking anything).
>
> On 30 Jul 2015, at 16:24, Scott Arciszewski sc...@paragonie.com wrote:
>> On Thu, Jul 30, 2015 at 11:20 AM, Craig Francis cr...@craigfrancis.co.uk wrote:
>>> On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote:
>>>> This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications.
>>>
>>> Good, because my suggestion was also addressing XSS with poor (or completely missing) HTML escaping... have a look:
>>>
>>> http://news.php.net/php.internals/87207
>>> https://bugs.php.net/bug.php?id=69886
>>>
>>> Now I admit it won't fix everything with XSS (as HTML escaping is a bit harder), but it certainly will pick up quite a lot of the issues (and it won't break anything either, just help developers identify problems).
>>>
>>> And no, SQL injection is far from a solved problem... this is why, after 15 years of me trying to tell my fellow developers to not make these mistakes, I'm still finding them making them over and over again... hence why I'm making the above suggestion.
>>>
>>> Craig
>>>
>>> On 30 Jul 2015, at 14:43, Scott Arciszewski sc...@paragonie.com wrote:
>>>> On Tue, Jul 28, 2015 at 1:33 PM, Matt Tait matt.t...@gmail.com wrote:
>>>>> Hi all,
>>>>>
>>>>> I've written an RFC (and PoC) about automatic detection and blocking of SQL injection vulnerabilities directly from inside PHP via automated taint analysis.
>>>>>
>>>>> https://wiki.php.net/rfc/sql_injection_protection
>>>>>
>>>>> In short, we make zend_strings track where their value originated. If it originated as a T_STRING, from a primitive (like int) promotion, or as a concatenation of such strings, it's a query that can't have been SQL-injected by an attacker-controlled string. If we can't prove that the query is safe, that means that the query is either certainly vulnerable to a SQL-injection vulnerability, or sufficiently complex that it should be parameterized just-to-be-sure.
>>>>>
>>>>> There's also a working proof of concept over here:
>>>>>
>>>>> http://phpoops.cloudapp.net/oops.php
>>>>>
>>>>> You'll notice that the page makes a large number of SQL statements, most of which are not vulnerable to SQL injection, but one is. The proof of concept is smart enough to block that one vulnerable request, and leave all of the others unchanged.
>>>>>
>>>>> In terms of performance, the cost here is negligible. This is just basic variable taint analysis under the hood (not an up-front intraprocedural static analysis or anything complex), so there's basically no slowdown.
>>>>>
>>>>> PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a "feels safe" rather than "is safe" way.
>>>>>
>>>>> What do you all think? There's obviously a bit more work to do; the PoC currently only covers mysqli_query, but I thought this stage is an interesting point to throw it open to comments before working to complete it.
>>>>>
>>>>> Matt
>>>>
>>>> Hi Matt,
>>>>
>>>>> PHP SQL injections are the #1 way PHP applications get hacked - and all SQL injections are the result of a developer either not understanding how to prevent SQL injection, or taking a shortcut because it's fewer keystrokes to do it a "feels safe" rather than "is safe" way.
>>>>
>>>> This may have been true at one point in time, but my own experience and the statistics collected by Dan Kaminsky of White Hat Security indicates that Cross-Site Scripting vulnerabilities are much more prevalent in 2015 than SQL Injection, especially in business applications. If Google has information that indicates that SQLi is still more prevalent than XSS, I'd love to see this data.
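The origin-tracking idea described in the quoted RFC summary (a string is trusted only if it is a literal or a concatenation of trusted strings) can be sketched in a few lines. This is a toy model, not the RFC's proof of concept and not how zend_string is laid out: the tstr type and every function name below are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type that remembers whether its value is trusted. */
typedef struct {
    char *value;
    bool trusted; /* true only for literals and concatenations of literals */
} tstr;

static tstr tstr_new(const char *text, bool trusted)
{
    size_t n = strlen(text);
    tstr s;
    s.value = malloc(n + 1);
    if (s.value == NULL) {
        abort();
    }
    memcpy(s.value, text, n + 1);
    s.trusted = trusted;
    return s;
}

/* A string that appears literally in the program source. */
static tstr tstr_literal(const char *text) { return tstr_new(text, true); }

/* A string that came from outside (request data and the like). */
static tstr tstr_user_input(const char *text) { return tstr_new(text, false); }

/* Concatenation is trusted only if both operands are trusted. */
static tstr tstr_concat(const tstr *a, const tstr *b)
{
    size_t la = strlen(a->value), lb = strlen(b->value);
    tstr s;
    s.value = malloc(la + lb + 1);
    if (s.value == NULL) {
        abort();
    }
    memcpy(s.value, a->value, la);
    memcpy(s.value + la, b->value, lb + 1);
    s.trusted = a->trusted && b->trusted;
    return s;
}

/* The check a query function would perform before sending the SQL. */
static bool query_allowed(const tstr *query) { return query->trusted; }

static void tstr_free(tstr *s)
{
    free(s->value);
    s->value = NULL;
}
```

In this model, a query built from tstr_literal() pieces passes query_allowed(), while one that concatenates a tstr_user_input() value is rejected. It also shows the objection raised elsewhere in the thread: a value that is perfectly clean but not a literal (a table prefix read from configuration, say) would be rejected too.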
Re: [PHP-DEV] Disabling External Entities in libxml By Default
Anatol Belski wrote:
> -Original Message-
> From: Pierre Joye [mailto:pierre@gmail.com]
> Sent: Wednesday, July 29, 2015 11:01 PM
> To: Anthony Ferrara ircmax...@gmail.com
> Cc: PHP internals internals@lists.php.net
> Subject: Re: [PHP-DEV] Disabling External Entities in libxml By Default
>
>> On Jul 29, 2015 11:38 PM, Anthony Ferrara ircmax...@gmail.com wrote:
>>> All,
>>>
>>> I wanted to float an idea by you for PHP 7 (or 7.1 depending on the RM's feedback).
>>>
>>> Currently, PHP by default is vulnerable to XXE attacks: https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing
>>>
>>> To bypass this, you need to turn off external entity loading:
>>>
>>> libxml_disable_entity_loader(true);
>>>
>>> What I'm proposing is to disable entity loading by default. That way it requires developers to opt-in to actually load external entities.
>>>
>>> Thoughts?
>>
>> I am for it, for 7.0 or 8.0.
>>
>> We discussed it during the last related flaw and decided not to do it for BC reasons (whatever it means in this case). This problem went off our radar, so yes, we should do it in 7.0.
>>
>> Changing defaults in minor versions always creates more trouble.
>
> To note is that the libxml-2.9.2 in Windows builds already contains the patches mentioned in https://www.debian.org/security/2013/dsa-2652 , see https://github.com/winlibs/libxml2/commit/727e357fb21b95d5c315518bdac99a70a6d15ff8 ... Most of the distributions should already have these patches. Probably we should check whether disabling it in PHP is unnecessary, but if it's not - ofc 7.0 should be the target at least.

It seems to me that this patch addresses only part of the XXE problem. However, according to OWASP it would be sufficient to protect against XXE by not setting XML_PARSE_NOENT and XML_PARSE_DTDLOAD (checked as of libxml 2.9). AFAIK PHP does not set these options unless requested by the user, whereas XML_PARSE_NOENT can also be set via DOMDocument::substituteEntities. Some note about the potential danger of these options/properties might be appropriate in the manual.

--
Christoph M. Becker
Re: [PHP-DEV] Disabling External Entities in libxml By Default
Hello

Disabling this will (at least for me) cause SOAP related stuff to stop working as it was expected to work before!

<?php
$wsdl = "https://www.some.tld/soap.php?wsdl";
$soap = new SoapServer($wsdl, array());

wsdl:

<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions
    xmlns:http="http://schemas.xmlsoap.org/wsdl/http/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:s="http://www.w3.org/2001/XMLSchema"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:tm="http://microsoft.com/wsdl/mime/textMatching/"
    xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.some.tld/soap/muppet/user/1.0/"
    targetNamespace="http://www.some.tld/soap/muppet/user/1.0/">
  <wsdl:types>
    <s:schema targetNamespace="http://www.some.tld/soap/muppet/user/1.0/"
        xmlns:tns="http://www.some.tld/soap/muppet/user/1.0/"
        elementFormDefault="qualified">
...

It fails with error to read external entity, failed while parsing /external entity /'http://www.some.tld/muppet.php?wsdl' ..

I don't know if I read this error correctly, but to me it looks like PHP on the www frontend refuses to read WSDL/SOAP/XML from the www backend because of this...

Pretty much all of the SOAP idea is gone then..?

/ Jake

On 2015-07-29 22:37, Anthony Ferrara wrote:
> All,
>
> I wanted to float an idea by you for PHP 7 (or 7.1 depending on the RM's feedback).
>
> Currently, PHP by default is vulnerable to XXE attacks: https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing
>
> To bypass this, you need to turn off external entity loading:
>
> libxml_disable_entity_loader(true);
>
> What I'm proposing is to disable entity loading by default. That way it requires developers to opt-in to actually load external entities.
>
> Thoughts?
>
> Anthony
Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30
On 30/07/2015 11:12 pm, Niklas Keller wrote:
> 2015-07-30 14:42 GMT+02:00 Andone, Bogdan bogdan.and...@intel.com:
>> -Original Message-
>> From: Niklas Keller [mailto:m...@kelunik.com]
>> Sent: Thursday, July 30, 2015 1:47 PM
>> To: Pierre Joye
>> Cc: lp_benchmark_robot; PHP internals; l...@lists.01.org
>> Subject: Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30
>>
>>> 2015-07-30 11:57 GMT+02:00 Pierre Joye pierre@gmail.com:
>>>> Hi,
>>>>
>>>> Does someone have a contact there? It would be nicer to have these results combined with what we pushed on qa.php.net as well.
>>>>
>>>> Cheers,
>>>> Pierre
>>>
>>> Thought about that as well, results per mail aren't that useful, especially as they're badly formatted for me in GMail (no fixed font). A graph visualizing those numbers would be nice.
>>>
>>> Regards,
>>> Niklas
>>
>> Hi Guys,
>>
>> We are glad that our small ticking spam is starting to get noticed :) !
>>
>> We would like to offer valuable information to the community related to performance trends of the PHP project on Intel platforms based on daily builds, and we are open to suggestions for making these results relevant.
>>
>> We chose to share our numbers as plain text mails so that the summary snapshots can easily be seen on discussion lists without the need for other clicks. Everybody agrees that plain text is ugly and, yes, you need to have a fixed font in place for the table to be formatted correctly. Let's discuss a better way of doing it; integration with qa.php.net is possible if we find the right interface for sharing data in an automated way.
>>
>> Normally l...@lists.01.org should be the official entry point for feedback and requests but, unfortunately, it is not yet operational, so I will be your direct contact, as I am part of the team which deploys this project.
>>
>> Kind regards,
>> Bogdan
>
> Hi Bogdan,
>
> I think absolute numbers (instead of a %-change) would be better suited for visualizing performance over time.
>
> Regards,
> Niklas

I agree on using absolute numbers. With percentages it is not immediately obvious whether the change was good or bad.

Including the build options would be good.

Chris

--
http://twitter.com/ghrd
Re: [PHP-DEV] Introduction and some opcache SSE related stuff
Hi Dmitry, Bogdan,

- Original Message - From: Dmitry Stogov Sent: Thursday, July 30, 2015

> Hi Bogdan,
>
> On Wed, Jul 29, 2015 at 5:22 PM, Andone, Bogdan bogdan.and...@intel.com wrote:
>> Hi Guys,
>>
>> My name is Bogdan Andone and I work for Intel in the area of SW performance analysis and optimizations. We would like to actively contribute to the Zend PHP project and to involve ourselves in finding new performance improvement opportunities based on available and/or new hardware features.
>>
>> I am still in the source code digesting phase, but I had a look at the fast_memcpy() implementation in the opcache extension, which uses SSE intrinsics. If I am not wrong, fast_memcpy() is not currently used, as I didn't find the -msse4.2 gcc flag in the Makefile. I assume you probably didn't see any performance benefit, so you preserved generic memcpy() usage.
>
> This is not SSE4.2, this is SSE2. Any x86_64 target implements SSE2, so it's enabled by default on x86_64 systems (at least on Linux). It may also be enabled on x86 targets by adding the -msse2 option.

Right, I was gonna say, I think that was a mistake, and all x86_64 should be using it at least... Of course, using anything newer that needs special options is nearly useless, since I guess the vast majority aren't building themselves, but using lowest-common-denominator repos. I had been wondering about speeding up some other things, maybe taking advantage of SSE4.x (string stuff, I don't know), but... like I said.

Runtime checks would be awesome, but except for recent GCC, the intrinsics aren't available unless the corresponding SSE option is enabled (lame!). So that requires a separate compilation unit. :-/ Of course, I guess if the intrinsic maps simply to the instruction, one could just do it with inline asm, if one wanted to do runtime CPU checking.

>> I would like to propose a slightly different implementation which uses _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied memory is preserved in the data cache, which is not bad, as the interpreter will start to use this data without the need to go back one more time to memory. _mm_stream_si128() in the current implementation is intended for stores where we want to avoid reading data into the cache and polluting it; in the opcache scenario it seems that preserving the data in cache has a positive impact.
>
> _mm_stream_si128() was used on purpose, to avoid CPU cache pollution, because data copied from SHM to process memory is not necessarily used before eviction. By the way, I'm not completely sure. Maybe _mm_store_si128() can provide a better result.

Interesting (that _stream was used on purpose). :-)

>> Running php-cgi -T1 on WordPress4.1/index.php I see a ~1% performance increase for the new version of fast_memcpy() compared with the generic memcpy(). Same result using a full load test with http_load on a Haswell EP 18 cores.
>
> 1% is a really big improvement. I'll be able to check this only next week (when back from vacation).

Well, he talks like he was comparing to *generic* memcpy(), so...? But not sure how that would have been accomplished.

BTW guys, I was wondering before why fast_memcpy() only in this opcache area? For the prefetch and/or cache pollution reasons?

Because shouldn't the library functions in glibc, etc. already be using versions optimized for the CPU at runtime? So is generic memcpy() already fast? (Other than overhead for a function call.)

>> Here is the proposed pull request: https://github.com/php/php-src/pull/1446
>>
>> Related to the SW prefetching instructions in fast_memcpy()... they are not really useful in this place. Their benefit is almost negligible, as the address requested for prefetch will be needed at the next iteration (a few cycles later), while the time needed to get data from RAM is usually 100 cycles. Nevertheless... they don't hurt, and it seems they still have a very small benefit, so I preserved the original instruction and added a new prefetch request for the destination pointer.
>
> I also didn't see a significant difference from software prefetching.

So how about prefetching further/more iterations ahead...?

> Thanks. Dmitry.

>> Hope it helps,
>> Bogdan

- Matt
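Matt's point about runtime CPU checks can be illustrated with a small dispatch sketch. It relies on GCC/Clang-specific features that are known to exist (the target("sse4.2") function attribute and __builtin_cpu_supports()), which is the "recent GCC" case he mentions; older compilers would need the separate compilation unit or inline asm route instead. CRC-32C is used only because it maps to a single SSE4.2 instruction; the function names are assumptions for illustration and nothing here is taken from php-src.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <nmmintrin.h> /* SSE4.2: _mm_crc32_u8 */
#define SKETCH_X86_DISPATCH 1
#else
#define SKETCH_X86_DISPATCH 0
#endif

/* Portable bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_generic(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++) {
            /* Shift right; XOR in the polynomial if the dropped bit was 1. */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

#if SKETCH_X86_DISPATCH
/*
 * Compiled for SSE4.2 even though the rest of the file is not; this
 * per-function target attribute is the compiler-specific part.
 */
__attribute__((target("sse4.2")))
static uint32_t crc32c_sse42(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc = _mm_crc32_u8(crc, *p++);
    }
    return ~crc;
}
#endif

/* Picks the SSE4.2 version only if the running CPU reports support. */
static uint32_t crc32c(const unsigned char *p, size_t n)
{
#if SKETCH_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2")) {
        return crc32c_sse42(p, n);
    }
#endif
    return crc32c_generic(p, n);
}
```

The binary built this way still runs on a plain SSE2 machine, because the SSE4.2 instruction is only reached after the runtime check. A real implementation would cache the check result (for instance in a function pointer) rather than repeat it on every call.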
Re: [PHP-DEV] Disabling External Entities in libxml By Default
Hi!

> The problem here is that imagine the following:

I think if we separate the loading of the initial file (i.e., the starting point of the XML parser) from the loading of the entities from that file (which is not happening right now), we'd solve many BC problems. Not sure about SOAP, but many others for sure.

> I know that you want it to work, but this is actually a great place to fail, because you're loading a trusted resource over HTTP. Meaning that an attacker could MITM and inject malicious XML into the response, and own your server without even needing to own the endpoint.

I feel like the XML parser is the wrong place to solve this problem; transport security can be done with HTTPS, signatures, etc. Otherwise many protocols that rely on XML - such as SAML, which is quite widely used - would be completely useless.

--
Stas Malyshev
smalys...@gmail.com
Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30
2015-07-30 11:57 GMT+02:00 Pierre Joye pierre@gmail.com:
> Hi,
>
> Does someone have a contact there? It would be nicer to have these results combined with what we pushed on qa.php.net as well.
>
> Cheers,
> Pierre

Thought about that as well, results per mail aren't that useful, especially as they're badly formatted for me in GMail (no fixed font). A graph visualizing those numbers would be nice.

Regards,
Niklas
Re: [PHP-DEV] Benchmark Results for PHP Master 2015-07-30
Hi,

Does someone have a contact there? It would be nicer to have these results combined with what we pushed on qa.php.net as well.

Cheers,
Pierre

On Jul 30, 2015 3:29 PM, lp_benchmark_robot lp_benchmark_ro...@intel.com wrote:
> Results for project php-src-nightly, build date 2015-07-30 05:00:00+03:00
> [...]
Re: [PHP-DEV] Reclassify E_STRICT notices
On Sun, Feb 22, 2015 at 11:30 PM, Nikita Popov nikita@gmail.com wrote:

Hi internals!

I would like to propose reclassifying our few existing E_STRICT notices and removing this error category: https://wiki.php.net/rfc/reclassify_e_strict

As we don't really have good guidelines on when which type of error should be thrown, I'm mainly going by what category other similar errors use. I'm open to suggestions, but hope this will not deteriorate into total bikeshed.

Thanks,
Nikita

Hi,

This RFC got accepted, but there are 4 more E_STRICTs in the core which were kept/missed:

http://lxr.php.net/xref/PHP_TRUNK/ext/date/php_date.c#1544
http://lxr.php.net/xref/PHP_TRUNK/ext/standard/html.c#1241
http://lxr.php.net/xref/PHP_TRUNK/ext/mysqli/mysqli_api.c#1606
http://lxr.php.net/xref/PHP_TRUNK/ext/mysqli/mysqli_api.c#1644

For the sake of consistency I would like these to be removed/changed into some other error levels so that we emit no E_STRICT from the core. I cc'ed Andrey and Derick on this mail as I'm curious about their opinion on the mysql and date related ones.

--
Ferenc Kovács
@Tyr43l - http://tyrael.hu
Re: [PHP-DEV] json_decode/encode should return full precision values by default
On Thu, Jul 30, 2015 at 1:25 AM, Yasuo Ohgaki yohg...@ohgaki.net wrote:

Hi all,

On Thu, Jul 30, 2015 at 7:44 AM, Yasuo Ohgaki yohg...@ohgaki.net wrote:

On Thu, Jul 30, 2015 at 1:13 AM, Nikita Popov nikita@gmail.com wrote:

Instead of continuing to use serialize_precision, which will produce unnecessarily long outputs for many values, why don't we just switch to using the 0 mode of zend_dtoa, i.e. to return the shortest output that is still accurate if interpreted in round-to-nearest. I think this is what everybody else is using when they convert floating point numbers to strings. I guess we may not be able to change normal floating point printing to use this, but this seems like the best mode for anything using serialize_precision now and everything that should be using it (like JSON, and queries, etc).

I prefer your proposal! It is a lot better than what we have now. Does anyone have an opinion on this? I'm writing the RFC and I would like to make this the first option, i.e. serialize_precision=0 uses zend_dtoa 0 mode for all data exchange functions (json/serialize/var_export. Does anyone care about WDDX/XML_RPC?)

I wrote a draft RFC. https://wiki.php.net/rfc/precise_float_value

Please comment. I would like to start the RFC discussion shortly. Thank you.

Nice idea about using a special serialize_precision value for this. This allows keeping BC for those that have tests for particular serialize output or similar things.

I would suggest defaulting serialize_precision to -1 in PHP 7. If people want the previous behavior they can still have it, but I think -1 is the more reasonable default as it matches what one would naturally expect.

I don't see the need for having a separate setting for JSON. Having a dozen different float precision settings will not help anyone.

Nikita
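For readers following the thread, the difference between a fixed 17-digit precision and the shortest round-trip mode can be sketched as below. This is only an illustration under an assumption: it needs a PHP build (7.1 or later) in which json_encode() honours serialize_precision and the value -1 selects zend_dtoa mode 0. The value 0.1 is just a convenient sample.

```php
<?php
// Assumption: PHP 7.1+, where json_encode() reads serialize_precision
// and -1 means "shortest string that round-trips" (zend_dtoa mode 0).

$value = 0.1;

// 17 significant digits always round-trip, but expose binary noise.
ini_set('serialize_precision', '17');
$long = json_encode($value);

// -1 asks for the shortest output that still decodes to the same double.
ini_set('serialize_precision', '-1');
$short = json_encode($value);

echo $long, "\n";   // 0.10000000000000001
echo $short, "\n";  // 0.1

// The short form loses nothing: decoding gives back the identical float.
var_dump(json_decode($short) === $value);  // bool(true)
```

Both strings decode to the same double; the short form simply drops digits that carry no information, which is the argument made above for using this mode in data exchange functions.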
Re: [PHP-DEV] [RFC] Block requests to builtin SQL functions where PHP can prove the call is vulnerable to a potential SQL-injection attack
On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote:

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place.

I completely agree on education, and what I'm hoping for... and this is how we can educate everyone :-)

My suggestion for taints (not quite the same as the one from Matt or Wietse) was not to change the way good programs are created/executed, but simply an education device, which can also pick up mistakes that experienced developers make.

My first post on this mailing list gives a better overview: http://news.php.net/php.internals/87207

The original implementation suggestion is at: https://bugs.php.net/bug.php?id=69886

You will see that it does nothing more than create notices to say "erm, do you want to be doing this?". This is something that only PHP can do, unless you can find a way of changing every single article / code example on the internet :-)

So, with your example... if you want to use a variable for a table/field prefix, that is perfectly fine... in fact, it won't need any changes, as the prefix will probably be hard coded as a string within a PHP script (something I called ETYPE_CONSTANT). But if not (e.g. storing the prefix in an ini file?), then I've shown an example of how that can be handled with the proposed string_encoding_set function (something I should have probably called string_escaping_set)... which is simply to tell PHP that this one variable is already safe (something I can't see being needed very often).

Craig

On 30 Jul 2015, at 08:24, Lester Caine les...@lsces.co.uk wrote:

On 29/07/15 16:11, Craig Francis wrote:

I completely disagree... prepared statements are just as vulnerable, and so are ORMs. You can push developers towards these solutions, and that would be good, but you are completely blind if you think an uneducated developer won't do:

if ($stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=" . $_GET['name'])) { }

But that is a perfect example of what I am talking about. You do not educate people by publishing the very thing that is wrong. You educate them by pointing out to them WHY the '?' was there in the first place.

Since the taint extension only covers mysql and sqlite, it's of little use if we manage to convert the 'uneducated developer' to any of the more secure databases, and that was one of the reasons why mysql was dropped from being loaded by default. Once one starts from a base of parametrised SQL queries, the lax programming methods many mysql guides and books continue to push can be reversed. Throwing more bloat into PHP to create 'WTF' errors just adds to a new user's frustration and annoys experienced users who have very good reasons for building queries using clean variables. MANY abstraction layers use variables to add prefixes to table names or fields.

Educate ... don't nanny ...

--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
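The point both sides are circling, that calling prepare() does not help once user input has been concatenated into the SQL text, can be sketched as follows. This is an illustration only: it swaps the mysqli connection from the thread for PDO with an in-memory SQLite database so that it runs without a server, and the table contents and the $name value (standing in for $_GET['name']) are made up for the example.

```php
<?php
// Assumption: the pdo_sqlite extension is available. PDO/SQLite is used
// here instead of mysqli purely so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical stand-in for the City table mentioned in the thread.
$db->exec('CREATE TABLE City (Name TEXT, District TEXT)');
$db->exec("INSERT INTO City VALUES ('Kabul', 'Kabol')");
$db->exec("INSERT INTO City VALUES ('Herat', 'Herat')");

// Stand-in for $_GET['name'] as supplied by an attacker.
$name = "x' OR '1'='1";

// Unsafe: prepare() is called, but the input is already part of the SQL.
$unsafe = $db->prepare("SELECT District FROM City WHERE Name = '" . $name . "'");
$unsafe->execute();
$unsafeRows = $unsafe->fetchAll(PDO::FETCH_COLUMN);

// Safe: the '?' placeholder keeps the input out of the SQL text entirely.
$safe = $db->prepare('SELECT District FROM City WHERE Name = ?');
$safe->execute([$name]);
$safeRows = $safe->fetchAll(PDO::FETCH_COLUMN);

echo count($unsafeRows), " rows via concatenation\n";
echo count($safeRows), " rows via placeholder\n";
```

With concatenation the injected OR clause matches every row in the table; with the placeholder the same input is treated as a literal city name and matches nothing. The first call is the kind of statement the proposed notices are meant to flag.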