Edit report at https://bugs.php.net/bug.php?id=62861&edit=1

 ID:                 62861
 User updated by:    soapergem at gmail dot com
 Reported by:        soapergem at gmail dot com
 Summary:            htmlentities returns empty string when it shouldn't
 Status:             Not a bug
 Type:               Bug
 Package:            *General Issues
 Operating System:   Windows
 PHP Version:        5.4.6
 Block user comment: N
 Private report:     N

 New Comment:

I am aware that Notepad is not a suitable editor for development. It is just
the de facto "basic" editor in Windows. If something doesn't work in Notepad,
you're usually in trouble.

I use an editor called EditPlus, which is a very good editor. The older version
I have been using cannot remove the BOM, but I see the newer version can, so I
will have to upgrade.

But I would really appreciate it if you could address my suggestion about using
the default_charset defined in php.ini automatically. Right now, having to call
htmlentities($string, ENT_COMPAT | ENT_HTML401, "") just to invoke what should
be the default seems very counter-intuitive.
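
For illustration, here is a minimal sketch of a wrapper that picks up
default_charset explicitly instead of relying on the empty-string argument.
The function name html_escape_default() is hypothetical (it is not part of
PHP), and the fallback to UTF-8 when the setting is empty is an assumption:

```php
<?php
// Hypothetical helper (not part of PHP): escape a string using whatever
// default_charset is configured, instead of a hard-coded encoding.
function html_escape_default($string)
{
    $charset = ini_get('default_charset');
    if ($charset === false || $charset === '') {
        $charset = 'UTF-8'; // assumption: fall back to UTF-8 when unset
    }
    return htmlentities($string, ENT_COMPAT | ENT_HTML401, $charset);
}

// Example: a script saved as ISO-8859-1, where the copyright sign is the
// single byte 0xA9.
ini_set('default_charset', 'ISO-8859-1');
echo html_escape_default("\xA9 2012"), "\n";
```

With default_charset set to ISO-8859-1 as above, the single byte 0xA9 is
treated as a valid copyright sign and converted to its entity.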


Previous Comments:
------------------------------------------------------------------------
[2012-08-19 14:27:31] ras...@php.net

Every real editor can do that. Windows Notepad is not a real editor. Notepad++
(which is free and much, much better than Notepad), Notepad2, TextMate, Vim,
jEdit, UltraEdit, Emacs, SourceEdit can all do this.

------------------------------------------------------------------------
[2012-08-19 14:27:07] ni...@php.net

Windows Notepad does not support this because Notepad is not a suitable editor
for development. All development-oriented text editors and IDEs support saving
files without a BOM.

One commonly used text editor for Windows is Notepad++ (in case you don't want 
to use a full-blown IDE).

------------------------------------------------------------------------
[2012-08-19 14:11:43] soapergem at gmail dot com

There is no option to save without the BOM in Windows Notepad. Nor is there an
option to save with/without the BOM in many other Windows editors. It is
automatically added to the file and there is nothing I can do about that,
short of writing a script to programmatically go through all my other scripts
with fopen(), remove the first three bytes, and then re-save.

That is NOT a practical option. PHP should be handling this.

As it stands, PHP 5.4 is completely unusable. Until you guys fix this, I need
to stick with 5.3, because 5.4 will break all of my scripts -- and all the
scripts of ANYONE who uses htmlentities() on a Windows server. Please take my
suggestion about using the default_charset to heart. That would finally resolve
this issue.

------------------------------------------------------------------------
[2012-08-19 13:59:09] ni...@php.net

Save your document as UTF-8 *without* BOM. The ï»¿ is just what the UTF-8
Byte Order Mark (BOM) looks like when it is output (which is probably something
you don't want, so save the file without it).
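
As a rough sketch of what "without BOM" means at the byte level: the UTF-8 BOM
is the three bytes EF BB BF at the very start of the file. The helper name
strip_utf8_bom() below is hypothetical and only illustrates the check; it is
not something PHP provides:

```php
<?php
// The UTF-8 BOM is the byte sequence EF BB BF at the start of the data.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: return $data without a leading UTF-8 BOM.
// Data that does not start with a BOM is returned unchanged.
function strip_utf8_bom($data)
{
    if (strncmp($data, UTF8_BOM, 3) === 0) {
        return substr($data, 3);
    }
    return $data;
}

$withBom = UTF8_BOM . "hello";
echo bin2hex(substr($withBom, 0, 3)), "\n"; // hex of the first three bytes
echo strip_utf8_bom($withBom), "\n";
```

The same function could be applied to the result of file_get_contents() before
writing a file back, but saving without the BOM in the editor avoids the need
for that.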

------------------------------------------------------------------------
[2012-08-19 13:49:39] ras...@php.net

From my command line:

php > echo htmlentities('©', ENT_COMPAT | ENT_HTML401, 'UTF-8');
&copy;

It works fine. If you are actually providing the correct UTF-8 char it will
work fine. You can verify that by doing this:

php > $a = chr(0xC2).chr(0xA9);
php > echo htmlentities($a, ENT_COMPAT | ENT_HTML401, 'UTF-8');
&copy;

Here I am explicitly passing C2A9 in and I get &copy; back out.

So I have no idea what your Windows Notepad is doing. Look at the output with a 
hex editor and see what it is converting that copyright character to.
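
If no hex editor is at hand, bin2hex() gives the same information from within
PHP. The sketch below assumes the editor saved the copyright character as the
single byte A9 (ISO-8859-1 / Windows-1252); that is only a guess at what
happened in this report, not something the thread confirms:

```php
<?php
$utf8   = "\xC2\xA9"; // copyright sign encoded as UTF-8 (two bytes)
$latin1 = "\xA9";     // copyright sign as one ISO-8859-1 byte (assumed case)

// Show the raw bytes of each string.
echo bin2hex($utf8), "\n";
echo bin2hex($latin1), "\n";

$flags = ENT_COMPAT | ENT_HTML401;

// Valid UTF-8 input is converted to the entity.
var_dump(htmlentities($utf8, $flags, 'UTF-8'));

// A lone A9 byte is not valid UTF-8, so without ENT_IGNORE or
// ENT_SUBSTITUTE htmlentities() returns an empty string.
var_dump(htmlentities($latin1, $flags, 'UTF-8'));
```

Comparing the hex output against c2a9 shows quickly whether the input really
is UTF-8.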

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=62861

