Hi!
I'm not completely against it. It's just an incomplete solution.
echo "\u{1F602}"; // won't output if the output encoding is not UTF-8
You can always use iconv/recode to convert it to any encoding you need
(provided it supports the full Unicode range). I see this as a readability
feature -
Maybe I misunderstood something, but why introduce Unicode escapes if
the PHP engine doesn't support Unicode?
Always converting such escapes into UTF-8 doesn't make any sense
for people who use other encodings for output, databases, etc.
Thanks. Dmitry.
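A minimal sketch of the point under discussion, assuming PHP 7.0 or later (where the \u{...} syntax from this RFC is available) and the iconv extension. The escape always yields UTF-8 bytes, so a script that needs another encoding has to convert explicitly. The variable names are illustrative only.

```php
<?php
// Assumes PHP >= 7.0 and the iconv extension.
// The escape is expanded at compile time into UTF-8 bytes.
$utf8 = "\u{1F602}";

// Convert explicitly when the target encoding is not UTF-8.
$utf16 = iconv('UTF-8', 'UTF-16BE', $utf8);

echo bin2hex($utf8), "\n";  // UTF-8 bytes as hex
echo bin2hex($utf16), "\n"; // UTF-16BE bytes (a surrogate pair) as hex
```

The target encoding must be able to represent the code point; otherwise iconv() fails or needs a //TRANSLIT or //IGNORE suffix.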
On Tue, Nov 25, 2014 at 1:09
On 24.11.14 23:09, Andrea Faulds wrote:
Good evening,
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I think the choice of \u{xx} is interesting, i.e. using '{' and '}'.
Afaik, one of the current best practices is to use json_decode(), like so:
$ cat test.php
<?php var_dump(
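The snippet above is cut off in the archive. For illustration only, and not necessarily what the poster wrote, the json_decode() workaround is commonly sketched like this. It relies on JSON's \uXXXX escapes, so a character outside the BMP has to be written as a surrogate pair.

```php
<?php
// Single quotes keep the backslashes literal, so json_decode() sees the
// JSON escapes. U+1F602 lies outside the BMP and needs a surrogate pair.
$emoji = json_decode('"\ud83d\ude02"');

var_dump($emoji);          // the decoded string, as UTF-8 bytes
var_dump(bin2hex($emoji)); // the same bytes as hex
```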
On 25 Nov 2014, at 08:33, Dmitry Stogov dmi...@zend.com wrote:
Maybe I misunderstood something, but why introduce Unicode escapes if the PHP
engine doesn't support Unicode?
We don't have Unicode strings which are made of codepoints rather than bytes,
sure. But we do usually treat these
On 25 Nov 2014, at 08:33, Markus Fischer mar...@fischer.name wrote:
On 24.11.14 23:09, Andrea Faulds wrote:
Good evening,
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I think the choice of \u{xx} is interesting, i.e. using '{' and '}'.
Afaik, one of the current best
On Mon, 24 Nov 2014, Sara Golemon wrote:
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is the de-facto encoding,
On Tue, Nov 25, 2014 at 1:00 PM, Andrea Faulds a...@ajf.me wrote:
On 25 Nov 2014, at 08:33, Dmitry Stogov dmi...@zend.com wrote:
Maybe I misunderstood something, but why introduce Unicode escapes
if the PHP engine doesn't support Unicode?
We don't have Unicode strings which are made of
On 25 Nov 2014, at 10:32, Derick Rethans der...@php.net wrote:
On Mon, 24 Nov 2014, Sara Golemon wrote:
On the BMP versus SMP issue of \u styles, we addressed this in
PHP6 by making \u denote four-hexit BMP code points, while \U denoted
six-hexit code points, e.g. \u1234 === \U001234
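For comparison, a small sketch of the brace syntax from the RFC, assuming PHP 7.0 or later. A single \u{...} form covers both BMP and non-BMP code points, so no separate \U form is needed. The variable names are made up for the example.

```php
<?php
// Assumes PHP >= 7.0. One syntax covers BMP and non-BMP code points.
$bmp    = "\u{1234}";   // U+1234, inside the BMP
$padded = "\u{001234}"; // leading zeros are accepted
$astral = "\u{1F602}";  // U+1F602, outside the BMP

var_dump($bmp === $padded); // same code point, same bytes
var_dump(strlen($bmp));     // byte length of the UTF-8 encoding
var_dump(strlen($astral));
```

strlen() counts bytes here, not code points, which is the "strings are technically binary" caveat raised elsewhere in the thread.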
On 25 Nov 2014, at 10:41, Dmitry Stogov dmi...@zend.com wrote:
u8string says that the whole string is UTF-8 encoded.
Your Unicode escape proposal assumes only a UTF-8 code point, but the
encoding of the whole string is still undefined.
True. There’s an assumption there that you’re using a
On Tue, Nov 25, 2014 at 02:41:48PM +0400, Dmitry Stogov wrote:
I'm not completely against it. It's just an incomplete solution.
echo "\u{1F602}"; // won't output if the output encoding is not UTF-8
echo "Привет \u{1F602}"; // won't output anything useful if the script encoding is not UTF-8
On 25 Nov 2014, at 11:20, Alain Williams a...@phcomp.co.uk wrote:
I think that we need to clarify what we are talking about.
What Andrea has proposed is a way of writing string constants. These characters
in these strings will still be 8 bits big, this means that there needs to be some
Ivan Enderlin @ Hoa wrote:
On 24/11/2014 23:09, Andrea Faulds wrote:
Good evening,
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
It has a rationale section explaining why certain decisions were made,
that I’d recommend you read in full.
Excellent RFC, thank you for this
On Tue, Nov 25, 2014 at 11:25:17AM +, Andrea Faulds wrote:
Well, we *do* already have a compile-time system for declaring encoding, the
declare() construct.
I missed that. Reading the documentation, I confess that I do not really
understand what the effect of declare(encoding=xxx) is.
On Tue, 25 Nov 2014, Dmitry Stogov wrote:
On Tue, Nov 25, 2014 at 1:00 PM, Andrea Faulds a...@ajf.me wrote:
On 25 Nov 2014, at 08:33, Dmitry Stogov dmi...@zend.com wrote:
Maybe I misunderstood something, but why introduce Unicode escapes
if the PHP engine doesn't support Unicode?
Hi all,
On Tue, Nov 25, 2014 at 8:09 PM, Andrea Faulds a...@ajf.me wrote:
non-BMP code points are more important than ever.
Yes, they are! We (Japanese) have a number of them already.
\u{code point} has a huge advantage: we do not have to care whether the code
point value is in the BMP or not.
i.e. We can do
echo
On 25 Nov 2014, at 11:48, Derick Rethans der...@php.net wrote:
I think "incomplete" nails it on the head. Without proper Unicode
support in the parser, compiler and string function semantics, having
these escape codes doesn't really do a lot for us.
How so? Why are they less useful because
On Tue, Nov 25, 2014 at 2:18 PM, Andrea Faulds a...@ajf.me wrote:
On 25 Nov 2014, at 10:41, Dmitry Stogov dmi...@zend.com wrote:
u8string says that the whole string is UTF-8 encoded.
Your Unicode escape proposal assumes only a UTF-8 code point, but the
whole string encoding is still
On Tue, Nov 25, 2014 at 3:20 AM, Alain Williams a...@phcomp.co.uk wrote:
If we decide to support non-UTF-8 encodings at compile time then we could extend
the syntax a bit to allow the encoding to be specified, e.g.:
\U{utf-8: arabic letter alef}
\U{iso-8859-6: arabic letter alef}
Good evening,
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
It has a rationale section explaining why certain decisions were made, that I’d
recommend you read in full.
Thanks!
--
Andrea Faulds
http://ajf.me/
--
PHP Internals - PHP Runtime Development Mailing List
To
On 24 Nov 2014, at 22:09, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
My apologies to you all, a small correction: The title of that email should’ve
been “[RFC] Unicode Codepoint Escape Syntax” to match the title of the RFC, I
missed out the
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is the de-facto encoding, and recognizing
this is pretty reasonable.
You
On 24 Nov 2014, at 22:21, Sara Golemon poll...@php.net wrote:
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is
On 24 November 2014 at 14:21, Sara Golemon poll...@php.net wrote:
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is
On 24 Nov 2014, at 22:30, Adam Harvey ahar...@php.net wrote:
On 24 November 2014 at 14:21, Sara Golemon poll...@php.net wrote:
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I'm okay with producing UTF-8 even
On 24 November 2014 at 14:35, Andrea Faulds a...@ajf.me wrote:
On 24 Nov 2014, at 22:30, Adam Harvey ahar...@php.net wrote:
I'm also OK with this, although I do wonder if we should be respecting
the user's default_charset setting instead. (Since default_charset
defaults to UTF-8, in practice
We would have to require ICU, but that might be worthwhile for PHP 7
anyway. Having at least one i18n API that's guaranteed to be available
would be nice.
It's 2014. I think requiring ICU is reasonable at this point.
Orthogonal to this RFC, but I'd be in favor of deprecating all the
non-ICU
On 24 Nov 2014, at 23:19, Sara Golemon poll...@php.net wrote:
We would have to require ICU, but that might be worthwhile for PHP 7
anyway. Having at least one i18n API that's guaranteed to be available
would be nice.
It's 2014. I think requiring ICU is reasonable at this point.
I also
On Mon, Nov 24, 2014 at 02:21:37PM -0800, Sara Golemon wrote:
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is the
On 24 Nov 2014, at 23:29, Alain Williams a...@phcomp.co.uk wrote:
There is a big difference between \u or \U and \x or \o, and that is the number
of characters that follow the escape. \x has 2, \o has 3 - both are short and easy
to count with the eye. \U012345 is quite long and it is not so
On Mon, Nov 24, 2014 at 11:36:28PM +, Andrea Faulds wrote:
On 24 Nov 2014, at 23:29, Alain Williams a...@phcomp.co.uk wrote:
echo "\U{arabic letter alef}\n";
Ooh, that’s an interesting idea. I believe Perl actually has this already,
although it uses the \N syntax:
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds a...@ajf.me wrote:
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I've linked a provisional HHVM implementation from that page.
Planning to match whatever PHP7 does, of course, but for the moment
I've added named entity support since it's
On 24/11/2014 23:09, Andrea Faulds wrote:
Good evening,
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
It has a rationale section explaining why certain decisions were made, that I’d
recommend you read in full.
Excellent RFC, thank you for this proposal.
I would suggest this