Hi,

> From: Rui Hirokawa
>
> IMHO, #42396 is not a bug, but it is the specification.
> The normal script doesn't contain a null byte if it is not
> encoded in Unicode.
>
> It is understandable the addition of a unique byte sequence
> '0xFFFFFFFF' detection to support PHAR/PHK,
> but it is a change to add a new feature.

Sorry to insist, but since __halt_compiler() was introduced, your assertion no longer holds. It depends on what you consider 'the script'. If you consider only the data from the beginning of the file up to the __halt_compiler() directive, you are right: a null byte in that data means the script is Unicode. But the current Unicode detection is not aware of the __halt_compiler() directive and scans the whole file. It is perfectly legitimate for a non-Unicode script to contain null bytes, provided they come after a __halt_compiler() directive. So this is a bug, not a feature request. The side effect was simply not identified when __halt_compiler() was added.

The obvious solution is to decide that a non-Unicode script cannot contain null bytes, even after a __halt_compiler(). It would only require three lines in the PHP documentation. But it would introduce a severe limitation and, in practice, would make the __halt_compiler() feature almost useless.

The solution I am proposing is not very elegant, but it is the only one I found that does not make __halt_compiler() and multibyte support incompatible. As __halt_compiler() was introduced recently and, AFAICT, the only software using it is PHAR and PHK, I consider it acceptable, if not perfect.

Greg, Marcus, do you have a better idea? I assumed that Unicode detection is done before __halt_compiler() can be detected. Can you confirm?

Regards

Francois

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
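
P.S. To make the argument about null bytes after __halt_compiler() concrete, here is a rough sketch in Python. It is not the engine's detection code and not the proposed patch. The function names and the regex are hypothetical, and the token search is deliberately naive (a real implementation would need the lexer to ignore matches inside strings and comments). It only shows how a whole-file null-byte scan and a scan limited to the script part disagree on a file with a binary payload.

```python
import re

# Hypothetical, simplified token match: "__halt_compiler()" followed by ";" or "?>".
# Case-insensitive matching is an assumption of this sketch.
HALT_TOKEN = re.compile(rb"__halt_compiler\s*\(\s*\)\s*(?:;|\?>)", re.IGNORECASE)


def script_part(data: bytes) -> bytes:
    """Return the bytes up to and including the first __halt_compiler()
    statement, or the whole input if no such statement is found."""
    match = HALT_TOKEN.search(data)
    return data if match is None else data[:match.end()]


def null_in_whole_file(data: bytes) -> bool:
    """Null-byte check over the whole file, payload included."""
    return b"\x00" in data


def null_in_script_only(data: bytes) -> bool:
    """Null-byte check restricted to the part the compiler actually reads."""
    return b"\x00" in script_part(data)


# A non-Unicode script followed by a binary payload, as PHAR/PHK would produce.
sample = b"<?php echo 'ok';\n__halt_compiler();" + b"\x00\x01\x02 binary payload"

print(null_in_whole_file(sample))   # the payload's null byte is seen
print(null_in_script_only(sample))  # the script part itself has none
```

If the script really is UTF-16, the ASCII token does not match (its bytes are interleaved with nulls), so the sketch falls back to scanning everything and still finds the null bytes. Whether the engine can locate the directive before running its detection is exactly the open question above.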