RE: [PHP-DEV] Possibilities to fix some really poor behaviors in PHP7
From: julienpa...@gmail.com [mailto:julienpa...@gmail.com] On Behalf Of Julien Pauli
Sent: Tuesday, October 14, 2014 10:05 AM

On Tue, Oct 14, 2014 at 9:25 AM, Stas Malyshev smalys...@sugarcrm.com wrote:

Hi!

... like the hidden array element: http://3v4l.org/6uFqf
... like the hidden object property: http://3v4l.org/RPJXH

The issue seems to be that array lookup always looks for an integer key when it is given a numeric-like key. But when a property is added, the numeric check is not done, since properties are not supposed to be numeric. Thus, when the object is converted to an array, the property named 123 becomes inaccessible, because in an array it is supposed to be under the number 123.

We could, of course, add numeric checks to properties, but that would slow things down only to serve a very narrow use case with hardly any legitimate uses. We could also rewrite the hashtable with numeric keys instead of string keys when doing the conversion, but again that would be a significant slowdown for a very rare use case. Not sure it's worth it.

I agree: fixing a strange behavior that very few people know about, and that even fewer use in real code, at the cost of a performance penalty for every other use case should of course be a -1. Let's say the behavior is here by design ;-)

Julien.Pauli

If it is by design, it should be documented in http://php.net/manual/en/language.types.array.php#language.types.array.casting and maybe at a corresponding place on http://php.net/manual/en/language.types.object.php
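The behaviour described above can be reproduced with a minimal sketch like the following. The variable names are illustrative only, and the lookup results depend on the PHP version: the comments describe what the thread reports for the PHP of that time, and other releases may behave differently.

```php
<?php
// Illustrative sketch: a property with a numeric-like name survives the
// object-to-array cast, but may not be reachable through array lookup.
$obj = new stdClass();
$obj->{'123'} = 'hidden?';      // property names get no numeric check

$arr = (array) $obj;            // on the PHP discussed here, the string key is kept as-is

var_dump(count($arr));          // the element is present...
var_dump(isset($arr[123]));     // ...but an integer lookup may not find it
var_dump(isset($arr['123']));   // '123' is turned into int 123 on lookup as well

// Iteration does not go through key lookup, so the value is always visible.
foreach ($arr as $key => $value) {
    var_dump($key, $value);
}
```

Because the two `isset()` results are version-dependent, the sketch prints them instead of assuming a particular outcome.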
Re: [PHP-DEV] Unicode support
Good point. That's what I meant by a border-line case. Could you possibly point me to a specific example of such a false positive? I'm interested in a well-formed UTF-8 string. I believe the noël test is ill-formed UTF-8 and doesn't conform to the shortest-form requirement.

You're confusing two concepts here: well-formed UTF-8 represents any single code point with the smallest number of bytes, but it makes no requirements about what code points are represented. Representing ë as two code points is perfectly valid Unicode, and would in fact be required under NFD.

That most input sources would prefer the combined form seems like a weak assumption to base a library on; it only takes one popular third party routinely returning data in NFD for the problems to start showing up.

It's pretty meaningless to say you support Unicode, but only the easy bits. You might as well just tag each string with one of the pages of ISO-8859.

As far as I'm concerned, the Unicode specification does not require implementing all annexes, or even supporting the entire character set, to be conformant. I think there are always trade-offs involved, depending on what is more important for you.

Sure, but there are certain user expectations of what Unicode support means. Handling Korean characters in a meaningful way would definitely be on that list.

As I said at the top of my first post, the important thing is to capture what those requirements actually are. Just as you'd choose what array functions were needed if you were adding array support to a language.

To put it a different way, in what situation would you actively want to know the number of code points in a string, rather than either the number of bytes in its UTF-8 representation, or the number of graphemes?
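To make the three possible "lengths" in that last question concrete, here is a small sketch that counts bytes, code points and graphemes for both spellings of noël. It relies only on the PCRE functions bundled with PHP; the helper name `describe()` is hypothetical and not part of any proposal in this thread.

```php
<?php
// "noël" written two ways, spelled as raw UTF-8 bytes to avoid editor issues.
$decomposed  = "noe\xCC\x88l";  // e + U+0308 COMBINING DIAERESIS (NFD style)
$precomposed = "no\xC3\xABl";   // U+00EB LATIN SMALL LETTER E WITH DIAERESIS

// Hypothetical helper: report the three different "lengths" of a string.
function describe($s)
{
    return array(
        'bytes'       => strlen($s),
        // '.' with the /u and /s modifiers matches one code point at a time.
        'code_points' => preg_match_all('/./su', $s, $m),
        // \X matches one extended grapheme cluster (a user-perceived character).
        'graphemes'   => preg_match_all('/\X/u', $s, $m),
    );
}

print_r(describe($decomposed));   // bytes 6, code_points 5, graphemes 4
print_r(describe($precomposed));  // bytes 5, code_points 4, graphemes 4
```

Only the grapheme count agrees between the two spellings, which is the number a user would call the length of the word.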
Re: [PHP-DEV] Fixes for Visual Studio 2014
Hi Chris,

On Tue, October 14, 2014 15:35, Chris Tankersley wrote:

Hello all. Partially fueled by a joke to get PHP to compile on Windows 10, and partially fueled by starting to look more into core, I found some issues with the JavaScript-based configuration under Windows 10 and Visual Studio 2014, as well as issues where VS2014 includes better C support than older versions of VS.

https://github.com/php/php-src/pull/869

The PR adds some undefined checks for JavaScript variables, and some version checks so that VS2014 does not load as many custom Windows header files from PHP. If someone could look at the PR I'd really appreciate it.

I saw the PR but have had no chance to test it yet. I will get to it in the coming weeks if nobody is faster. Good to hear that it at least works there with such a small set of changes.

Regards

Anatol

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] RFC: PHP 7.0 timeline
On Wed, Oct 15, 2014 at 2:39 AM, Rasmus Lerdorf ras...@lerdorf.com wrote:
On 10/14/2014 05:20 PM, Tjerk Meesters wrote:
On 15 Oct 2014, at 01:24, Rasmus Lerdorf ras...@lerdorf.com wrote:
On 10/14/2014 10:14 AM, Stas Malyshev wrote:

Hi!

IMO, AST, INT64, NG and uniform variable syntax are enough for a new major version. Why do we still need to wait?

We don't need to just wait, as in sit and do nothing. We need to allocate time for other features.

There are also quite a few really low-level changes in master right now. It is going to take quite a bit of time to stabilize. For example, something as basic as array iteration is inadvertently different: https://bugs.php.net/68215

This is a known issue for which the test cases were marked as XFAIL because of the amount of work involved to get it fixed: https://github.com/php/php-src/commit/5831cca9576f4e0d4daed75a9915d436dfc5f4e5

Yes, I am aware of that. I was using it to illustrate the point that even with the changes currently in master, never mind any new things we might add, we have a lot of work left to do before we are anywhere near ready for a release.

I agree, we absolutely must provide something stable, solid, performant and feature-rich. It will take time to stabilize and to spot edge cases, some of which could even lead to a revert somewhere in the code.

Julien.Pauli
Re: [PHP-DEV] [RFC] Remove deprecated functionality in PHP 7
On Tue, Oct 14, 2014 at 4:00 PM, Johannes Schlüter johan...@schlueters.de wrote:
On Mon, 2014-10-13 at 23:06 -0700, Stas Malyshev wrote:

- drop incompatible $this context calls (probably seriously messed up code)

Before removing: could anybody check whether this breaks PEAR (incl. `pecl install`)? I don't know how much PHP 4 legacy code that required such tricks is still in there.

We plan to propose bundling pickle, which should solve the pecl issue. Pickle relies strictly on 5.4+.

Cheers,
--
Pierre
@pierrejoye | http://www.libgd.org
Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL
Hi Chris,

I've just blogged about IDN support in PHP. The post includes a (tiny) userland implementation of streams: http://dunglas.fr/2014/10/internationalized-domain-name-idn-and-php/

What do you think about the following to add native support:

1. As already stated, make ICU a dependency of core.
2. Convert the host returned by php_parse_url here https://github.com/php/php-src/blob/master/ext/standard/http_fopen_wrapper.c#L154 to Punycode with http://icu-project.org/apiref/icu4c432/uidna_8h.html#a711fa1d2e6dd25d7368f5b3ea2aaedc6

It looks not so intrusive and relatively easy to implement. According to the RFC I quote in the blog post, it should work with SSL too. I can make a PR (or an RFC if needed) with this method if it seems applicable.

Best regards,

2014-09-24 8:33 GMT+02:00 Pierre Joye pierre@gmail.com:
On Wed, Sep 24, 2014 at 2:48 AM, Stas Malyshev smalys...@sugarcrm.com wrote:

Hi!

I'll implement optional (and not default) support of IDN in filter_var(). Does anyone know if it's better to use libIDN (LGPL) or ICU (custom license derived from the X license) from a license point of view?

ICU is definitely better since we already have a lot of code using ICU and AFAIK our current IDN functions (idn_to_*) use ICU. Which means it would be advantageous to keep it in the single library - whatever bugs there may be, at least the user will be dealing with one set of bugs instead of two :)

Indeed :) However, I am not sure yet that we should do it, or at least not by default. It may introduce side effects or BC issues. While IDN conversion is bi-directional and can be applied many times with the same result, we have to be careful not to break things out there, for example for someone relying on it to process URIs/URLs.

Cheers,
--
Pierre
@pierrejoye | http://www.libgd.org

--
Kévin Dunglas
Freelance consultant and developer
http://dunglas.fr
Tel.: 06 60 91 20 20
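The conversion step in the proposal above can be approximated in userland today. The following is only a sketch: `hostToAscii()` is a hypothetical helper, it assumes the intl extension (which provides the ICU-backed `idn_to_ascii()`), and it says nothing about how a native implementation in the stream wrapper or in filter_var() would look.

```php
<?php
// Hypothetical helper: convert a possibly internationalized host name to its
// ASCII (Punycode) form so that ASCII-only code paths can handle it.
// Assumes the intl extension; returns null when it is missing or conversion fails.
function hostToAscii($host)
{
    if (!function_exists('idn_to_ascii')) {
        return null; // intl not loaded: nothing we can do in this sketch
    }
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    return $ascii === false ? null : $ascii;
}

$ascii = hostToAscii("b\xC3\xBCcher.example"); // "bücher.example" in UTF-8
if ($ascii !== null) {
    // Validate the converted form rather than the raw UTF-8 host.
    var_dump(filter_var('http://' . $ascii . '/', FILTER_VALIDATE_URL));
}
```

Converting before validation keeps the validator itself unchanged, which is one way to sidestep the BC concerns raised in the quoted discussion.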
Re: [PHP-DEV] Unicode support
On 15/10/14 10:04, Rowan Collins wrote:

Rowan,

As I said at the top of my first post, the important thing is to capture what those requirements actually are. Just as you'd choose what array functions were needed if you were adding array support to a language.

I'm sorry for not making myself clear. What I'm essentially saying is that I think the noël test is synthetic and impractical; it's also solvable by requiring NFC strings at input, and this is not an implementation defect. I also believe that Hangul is most likely to be precomposed and will work all right. And I have another opinion on UTF-8 shortest-form. This is my personal opinion, of course.

That aside, I think requirements are what I was asking about. I'm assuming that your standpoint is that string modification routines are at least required to take into account entire characters, not only code points. Am I correct?

What is confusing me is that I think you're seeing it as a major implementation defect. To avoid arguable implementations, I've made a short example in Java:

System.out.println(new StringBuffer("noël").reverse().toString());

It does produce the string l̈eon, as I would expect. Precomposed noël also works as I would expect, producing the string lëon.

What do you think, is this an implementation issue or solely a requirements issue?
Re: [PHP-DEV] RFC: PHP 7.0 timeline
If we aren't able to fix a low-level problem in a year, we most probably won't be able to fix it in two either. Also, the closer we are to release, the more feedback we get, and the more bugs we are able to fix. Delaying the release would just reduce the attention of the users.

Thanks. Dmitry.

On Tue, Oct 14, 2014 at 9:24 PM, Rasmus Lerdorf ras...@lerdorf.com wrote:
On 10/14/2014 10:14 AM, Stas Malyshev wrote:

Hi!

IMO, AST, INT64, NG and uniform variable syntax are enough for a new major version. Why do we still need to wait?

We don't need to just wait, as in sit and do nothing. We need to allocate time for other features.

There are also quite a few really low-level changes in master right now. It is going to take quite a bit of time to stabilize. For example, something as basic as array iteration is inadvertently different: https://bugs.php.net/68215

PHP is a mature project with reams and reams of legacy code out there. Every single change, no matter how small, is like throwing a hand grenade in a lake. There is the initial explosion and chaos and then the ripples that go on and on. Dealing with all these ripples takes a lot more time than most people think.

For people who think that 1 year from now is slow and conservative, it really isn't. It is quite aggressive given the number of really low-level changes that are already in master. Even if we froze the tree today I expect it could stretch to close to a year to stabilize.

-Rasmus
Re: [PHP-DEV] Unicode support
Aleksey Tulinov wrote (on 15/10/2014):
On 15/10/14 10:04, Rowan Collins wrote:

Rowan,

As I said at the top of my first post, the important thing is to capture what those requirements actually are. Just as you'd choose what array functions were needed if you were adding array support to a language.

I'm sorry for not making myself clear. What I'm essentially saying is that I think the noël test is synthetic and impractical

I remain unconvinced on that, and it's just one example. There are plenty of sequences which don't have a combined form, otherwise there would be no need for combining diacritics to exist in the first place.

it's also solvable by requiring NFC strings at input, and this is not an implementation defect. I also believe that Hangul is most likely to be precomposed and will work all right.

Requiring a particular normal form on input is not something a programming language can do. The only way you can guarantee NFC form is by performing the normalisation.

And I have another opinion on UTF-8 shortest-form.

There's no need for opinion there, we can consult the standard. http://www.unicode.org/versions/Unicode6.0.0/

D76 Unicode scalar value: Any Unicode code point except high-surrogate and low-surrogate code points.

D77 Code unit: The minimal bit combination that can represent a unit of encoded text for processing or interchange. [...] The Unicode Standard uses 8-bit code units in the UTF-8 encoding form [...]

D79 A Unicode encoding form assigns each Unicode scalar value to a unique code unit sequence.

D85a Minimal well-formed code unit subsequence: A well-formed Unicode code unit sequence that maps to a single Unicode scalar value.

D92 UTF-8 encoding form: The Unicode encoding form that assigns each Unicode scalar value to an unsigned byte sequence of one to four bytes in length, as specified in Table 3-6 and Table 3-7.
Before the Unicode Standard, Version 3.1, the problematic “non-shortest form” byte sequences in UTF-8 were those where BMP characters could be represented in more than one way. These sequences are ill-formed, because they are not allowed by Table 3-7.

In short: UTF-8 defines a mapping between sequences of 8-bit code units and abstract Unicode scalar values. Every Unicode scalar value maps to a single unique sequence of code units, but all Unicode scalar values can be represented. Since U+0308 COMBINING DIAERESIS is a valid Unicode scalar value, a UTF-8 string representing that value can be well-formed. It is only alternative representations of the same Unicode scalar value which must be in shortest form.

There may be standards for interchange in particular situations which enforce additional constraints, such as that all strings should be in NFC, but the applicability or correct implementation of such standards is not something that you can use to define handling in an entire programming language.

That aside, I think requirements are what I was asking about. I'm assuming that your standpoint is that string modification routines are at least required to take into account entire characters, not only code points. Am I correct?

Yes, I think that at least some functions should be available which work on characters as users would define them, such as length and perhaps safe truncation.

What is confusing me is that I think you're seeing it as a major implementation defect. To avoid arguable implementations, I've made a short example in Java:

System.out.println(new StringBuffer("noël").reverse().toString());

It does produce the string l̈eon, as I would expect.

Why do you expect that? Is this a result which would ever be useful? To be clear, I am suggesting that we aim to be the language which gets this right, where other languages get it wrong.

Precomposed noël also works as I would expect, producing the string lëon.
What do you think, is this an implementation issue or solely a requirements issue?

Well, you can only define an implementation defect with respect to the original requirement. If the requirement was to reverse characters, as most users would understand that term, then moving the diacritic to a different letter fails that requirement, because a user would not consider a diacritic a separate character. If the requirement was to reverse code points, regardless of their meaning, then the implementation is fine, but I would argue that the requirement failed to capture what most users would actually want.

Regards,
--
Rowan Collins [IMSoP]
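The distinction between well-formedness and normalisation drawn in this message can be checked with a short sketch. `isWellFormedUtf8()` is a hypothetical helper built on PCRE's UTF-8 validation; the final step assumes the intl extension's `Normalizer` class and is skipped when it is not available.

```php
<?php
// Well-formedness is about byte sequences; normalisation is about which
// code points were chosen. The two properties are independent.
function isWellFormedUtf8($s)
{
    // With the /u modifier PCRE validates the subject first: preg_match()
    // returns false for invalid UTF-8 and 1 when the empty pattern matches.
    return preg_match('//u', $s) === 1;
}

$combining = "e\xCC\x88";   // U+0065 U+0308: decomposed, but shortest-form bytes
$overlong  = "\xC0\xAF";    // overlong (non-shortest form) encoding of "/"

var_dump(isWellFormedUtf8($combining)); // bool(true)
var_dump(isWellFormedUtf8($overlong));  // bool(false)

// Guaranteeing NFC means actually normalising (intl's Normalizer, if present).
if (class_exists('Normalizer')) {
    $nfc = Normalizer::normalize($combining, Normalizer::FORM_C);
    var_dump($nfc === "\xC3\xAB"); // the precomposed U+00EB
}
```

The decomposed string passes the well-formedness check even though it is not in NFC, which is the point being made about the shortest-form rule.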
Re: [PHP-DEV] New globals for PUT and DELETE
2014-10-15 4:30 GMT+03:00 Stas Malyshev smalys...@sugarcrm.com:

Hi!

PHP today to enable successful easy implementation of RESTful interfaces.

Having done this, I beg to differ.

Try sending a parameter in the body with the PUT method: to read the parameters you have to use the ugly file_get_contents('php://input').

What exactly is the problem in this one-liner?

Worse still, you cannot upload files and send parameters together, because php://input is not available with enctype=multipart/form-data.

I'm not sure I understand what you're trying to do, could you explain in more detail with examples?

PUT /url
Content-type: application/x-www-form-urlencoded

parse_str(file_get_contents('php://input'), $_POST); // Ok

PUT /url
Content-type: multipart/mixed; boundary=

file_get_contents('php://input'); // Empty string
Re: [PHP-DEV] Unicode support
On 15/10/14 15:58, Rowan Collins wrote:

Rowan,

What is confusing me is that I think you're seeing it as a major implementation defect. To avoid arguable implementations, I've made a short example in Java:

System.out.println(new StringBuffer("noël").reverse().toString());

It does produce the string l̈eon, as I would expect.

Why do you expect that? Is this a result which would ever be useful?

I think I expect it to work this way because I know that this is a good trade-off between performance and the result produced. It also leaves open the possibility of doing it better if I need to.

To be clear, I am suggesting that we aim to be the language which gets this right, where other languages get it wrong.

Thank you for explaining this. I also think it could do better. I think a Unicode-aware strrev() shouldn't be too complicated to do.
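As a rough illustration of what a Unicode-aware strrev() might do, the sketch below reverses the decomposed noël once by code point and once by grapheme cluster. Both function names are hypothetical, and the approach (PCRE's `\X`) is just one userland option, not a proposed implementation.

```php
<?php
// Reverse by code point: combining marks end up on the wrong base letter.
function reverseCodePoints($s)
{
    $points = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_reverse($points));
}

// Reverse by extended grapheme cluster (\X): a base letter and its marks stay together.
function reverseGraphemes($s)
{
    preg_match_all('/\X/u', $s, $matches);
    return implode('', array_reverse($matches[0]));
}

$noel = "noe\xCC\x88l"; // "noël" with a combining diaeresis

var_dump(reverseCodePoints($noel) === "l\xCC\x88eon"); // bool(true): diaeresis moves to the "l"
var_dump(reverseGraphemes($noel)  === "le\xCC\x88on"); // bool(true): diaeresis stays on the "e"
```

The code point version reproduces the Java result quoted above; the grapheme version gives the result a user would expect.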
Re: [PHP-DEV] New globals for PUT and DELETE
I'm not sure I understand what you're trying to do, could you explain in more detail with examples?

PUT /url
Content-type: application/x-www-form-urlencoded

parse_str(file_get_contents('php://input'), $_POST); // Ok

PUT /url
Content-type: multipart/mixed; boundary=

file_get_contents('php://input'); // Empty string

You are missing information in your example. First, PHP doesn't do content negotiation for every Content-Type known to send serialized data. It is extremely selective: AFAIK, only URL-encoded and form-encoded Content-Types, when POSTed, are content-negotiated and parsed, and the result populates the $_POST superglobal.

Moreover, your second example does not show that an HTTP request was actually sent with a body, so it does not demonstrate a bug.

With a PHP script being served at index.php with the following contents:

<?php var_dump(file_get_contents('php://input'));

Here is a demonstration of this script, and a successful read of the request body:

$ echo 'F=BAR' | http --verbose PUT localhost:8000 \
    Content-Type:'multipart/mime; boundry=x'

PUT / HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Content-Length: 14
Content-Type: multipart/mime; boundry=x
Host: localhost:8000
User-Agent: HTTPie/0.8.0

F=BAR

HTTP/1.1 200 OK
Connection: close
Content-type: text/html
Host: localhost:8000
X-Powered-By: PHP/5.5.15

string(14) F=BAR

-ralph
Re: [PHP-DEV] New globals for PUT and DELETE
On 15 October 2014 22:14:32 GMT+01:00, Ralph Schindler ra...@ralphschindler.com wrote:

I'm not sure I understand what you're trying to do, could you explain in more detail with examples?

PUT /url
Content-type: application/x-www-form-urlencoded

parse_str(file_get_contents('php://input'), $_POST); // Ok

PUT /url
Content-type: multipart/mixed; boundary=

file_get_contents('php://input'); // Empty string

Here is a demonstration of this script, and a successful read of the request body:

$ echo 'F=BAR' | http --verbose PUT localhost:8000 \
    Content-Type:'multipart/mime; boundry=x'

PUT / HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Content-Length: 14
Content-Type: multipart/mime; boundry=x
Host: localhost:8000
User-Agent: HTTPie/0.8.0

I'm not sure if it makes a difference, but you mistyped the content type there: it should be `multipart/mixed`, not `multipart/mime`.

There may also be version differences at play here, because I think the behaviour of php://input has been changed a couple of times.
Re: [PHP-DEV] New globals for PUT and DELETE
Hi!

<?php
if ($_SERVER['REQUEST_METHOD'] == 'POST') {
    var_dump(file_get_contents('php://input'));
    exit;
}

I tried this script: if you do a POST, your data is in $_FILES; if you do a PUT, your data is in php://input. I'm still not sure what the big problem is.

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
Re: [PHP-DEV] New globals for PUT and DELETE
2014-10-16 2:13 GMT+03:00 Stas Malyshev smalys...@sugarcrm.com:

I tried this script: if you do a POST, your data is in $_FILES; if you do a PUT, your data is in php://input. I'm still not sure what the big problem is.

I added a form field; how do I get its value when using the PUT request method with enctype=multipart/form-data?

I am not arguing for the sake of it, this is a real problem: if you use the PUT request method with enctype=multipart/form-data, $_POST is empty and file_get_contents('php://input') is empty.

<?php
if ($_SERVER['REQUEST_METHOD'] == 'POST') {
    var_dump(file_get_contents('php://input'));
    exit;
}
?>
<html>
<body>
<form method="POST" enctype="multipart/form-data">
<input type="hidden" name="key" value="value">
<input type="file" name="file">
<hr>
<button>POST</button>
</form>
</body>
</html>
Re: [PHP-DEV] New globals for PUT and DELETE
Hi!

I added a form field; how do I get its value when using the PUT request method with enctype=multipart/form-data?

I am not arguing for the sake of it, this is a real problem: if you use the PUT request method with enctype=multipart/form-data, $_POST is empty and file_get_contents('php://input') is empty.

No, file_get_contents('php://input') is not empty. I just checked it, and if you send a PUT request, the whole request body, files and all, is in php://input. If you don't see it, you might be doing something wrong.

You are talking about PUT, but for the second time you have posted an example that uses POST. I'm not sure what you mean by that, as your script certainly does not demonstrate what you are saying.

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
Re: [PHP-DEV] New globals for PUT and DELETE
2014-10-16 2:36 GMT+03:00 Stas Malyshev smalys...@sugarcrm.com:

No, file_get_contents('php://input') is not empty. I just checked it, and if you send a PUT request, the whole request body, files and all, is in php://input. If you don't see it, you might be doing something wrong.

Yes, you're right, my tests were wrong: when I send a PUT with enctype=multipart/form-data, php://input is not empty. Sorry.

But parsing multipart data in a PHP script is not a very good solution. One can use the pecl HTTP extension, but it would be better to have a global $_BODY array.
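For the URL-encoded case, what a `$_BODY`-style array would contain can already be computed in userland, as in the sketch below. `parseRequestBody()` is a hypothetical helper, not an existing or proposed PHP function, and it deliberately leaves multipart bodies unparsed, which is exactly the gap being discussed.

```php
<?php
// Hypothetical helper sketching what a "$_BODY"-style array could contain for
// URL-encoded PUT/DELETE requests. Multipart bodies are left unparsed here.
function parseRequestBody($contentType, $rawBody)
{
    // Strip parameters such as "; charset=UTF-8" and normalise case.
    $parts = explode(';', $contentType);
    $mediaType = strtolower(trim($parts[0]));

    if ($mediaType === 'application/x-www-form-urlencoded') {
        parse_str($rawBody, $fields);
        return $fields;
    }
    return null; // unsupported in this sketch (e.g. multipart/form-data)
}

// In a real request the inputs would come from the server environment:
//   $contentType = isset($_SERVER['CONTENT_TYPE']) ? $_SERVER['CONTENT_TYPE'] : '';
//   $rawBody     = file_get_contents('php://input');
$body = parseRequestBody(
    'application/x-www-form-urlencoded; charset=UTF-8',
    'key=value&n[]=1&n[]=2'
);
var_dump($body);
```

Taking the content type and raw body as arguments keeps the helper testable outside a web request.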
Re: [PHP-DEV] RFC: PHP 7.0 timeline
Hi Zeev,

On Tue, Oct 14, 2014 at 10:08 AM, Zeev Suraski z...@zend.com wrote:

All,

We've had some discussions about it during the version name and phpng RFC processes, and now that 5.6.0 is behind us, I think it's time to get a more concrete game plan for PHP 7.0.

I drafted an RFC that proposes a one year timeline for PHP 7.0. I believe it strikes a good balance between early delivery to stay competitive, and having enough time to shape a major version. Given that we've already made some very substantial progress towards 7.0 (with phpng, AST, uniform variable syntax, etc.), I think that timeline is very realistic, and perhaps we can even beat it.

Restating the obvious, new features that don't have compatibility implications can always be delivered in minor versions, i.e. 7.1, 7.2, etc.

The RFC is at https://wiki.php.net/rfc/php7timeline - comments welcome!

So I have to say (again) that one year is too short. We discussed 1.5 years as a good, realistic timeline, and that's what I am going to propose today in a competing RFC, unless you are willing to add the option to this RFC (which would be easier). Whether we need a 5.7 release is also something we have to decide.

I am not saying we cannot make it in less than 1.5 years, but one year, given the current status, is not realistic. Given that the engine is still a moving target, it is neither realistic, fair nor correct to ask other developers to get their stuff done by 2015/1 or 2015/2; that is simply not the way I see cooperation and support for other php.net developers.

Cheers,
--
Pierre
@pierrejoye | http://www.libgd.org
[PHP-DEV] RFC: Return Types Update
Dear Internals,

I finally have a working implementation of the return types RFC[1] built on top of master. There are a few notes in the PR[2] about the implementation. I invite you all to review the PR and provide feedback.

This means that I will soon move the return types RFC to the voting phase. If you have not had time to review the RFC recently, I invite you to do so now. The RFC has been slightly altered since the last discussion:

- The RFC now targets PHP 7 (previously PHP 5.7).
- There is a new section about disallowing return types on certain methods[4].
- The design and accompanying section on reflection[3] have been rewritten entirely.

Regardless of the result of the RFC, I want to thank the many people who have been helpful to me as I have learned php-src and iterated over this RFC.

[1]: https://wiki.php.net/rfc/returntypehinting
[2]: https://github.com/php/php-src/pull/820
[3]: https://wiki.php.net/rfc/returntypehinting#reflection
[4]: https://wiki.php.net/rfc/returntypehinting#methods_which_cannot_declare_return_types
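For readers who have not opened the RFC, a declared return type looks roughly like the sketch below. It assumes the `: type` suffix syntax described in the linked RFC and therefore needs a PHP build that includes the feature; the function itself is a made-up example.

```php
<?php
// Made-up example of a declared return type: the ": array" suffix states that
// the function must return an array.
function firstWords(array $words, $limit): array
{
    return array_slice($words, 0, $limit);
}

var_dump(firstWords(array('return', 'types', 'rfc'), 2));
```

The RFC text remains the authority on which types are allowed and which methods cannot declare one.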