Re: [PHP] Re: htmlentities
On 13 September 2011 23:01, Shawn McKenzie nos...@mckenzies.net wrote:
> On 09/13/2011 01:38 PM, Ron Piggott wrote:
>> Is there a way to only change accented characters and not HTML
>> (Example: <p> </p> <a href=""> </a>)?
>>
>> The syntax
>>
>>     echo htmlentities( stripslashes(mysql_result($whats_new_result,0,"message")) ) . "\r\n";
>>
>> is doing everything (as I expect). I store breaking news within the
>> database as HTML formatted text. I am trying to see if a workaround is
>> available. Do I need to do a variety of search / replace to convert the
>> noted characters above back after htmlentities? (I am just starting to
>> get used to accented letters.)
>>
>> Thanks a lot for your help.
>>
>> Ron
>>
>> The Verse of the Day
>> “Encouragement from God’s Word”
>> http://www.TheVerseOfTheDay.info
>
> If it is meant to be HTML then why run htmlentities(), especially before
> storing it in the DB?
>
> --
> Thanks!
> -Shawn
> http://www.spidean.com
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php

Perhaps something like this might help you:

    $content = htmlspecialchars_decode(htmlentities($content, ENT_NOQUOTES, 'ISO-8859-1'), ENT_NOQUOTES);

or perhaps

    $table_all     = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'ISO-8859-1');
    $table_html    = get_html_translation_table(HTML_SPECIALCHARS, ENT_NOQUOTES);
    $table_nonhtml = array_diff_key($table_all, $table_html);
    $content1 = strtr($content1, $table_nonhtml);
    $content2 = strtr($content2, $table_nonhtml);

if using it multiple times.

--
It is not possible to simultaneously understand and appreciate the Intel architecture
--Ben Scott
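For anyone trying the two suggestions in the reply above, here is a minimal sketch that runs both on the same input and compares the results. The helper names encodeAccentsOnly() and accentTable() are made up for illustration, and the sketch assumes UTF-8 input on PHP 7 or later (the reply used ISO-8859-1; pass that as $charset if your data really is Latin-1).

```php
<?php
// Sketch only: helper names are hypothetical, and UTF-8 input is assumed
// (the reply above used ISO-8859-1; pass that as $charset if it applies).

// Approach 1: encode everything, then undo the encoding of &, < and >.
function encodeAccentsOnly(string $html, string $charset = 'UTF-8'): string
{
    $encoded = htmlentities($html, ENT_NOQUOTES, $charset);
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

// Approach 2: build the translation table once and reuse it with strtr().
function accentTable(string $charset = 'UTF-8'): array
{
    $all  = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, $charset);
    $html = get_html_translation_table(HTML_SPECIALCHARS, ENT_NOQUOTES, $charset);
    return array_diff_key($all, $html); // drop the entries for &, < and >
}

// "\u{e9}" is e-acute and "\u{e8}" is e-grave, written as escapes so the
// example does not depend on how this file itself is saved.
$input = "<p>Caf\u{e9} <a href=\"#\">cr\u{e8}me</a></p>";

$viaDecode = encodeAccentsOnly($input);
$viaTable  = strtr($input, accentTable());

echo $viaDecode, "\n"; // <p>Caf&eacute; <a href="#">cr&egrave;me</a></p>
echo $viaTable, "\n";  // same string
```

The table variant is probably the better fit when many strings are converted in one request, since the table is built once and strtr() does the rest.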
[PHP] Re: htmlentities
On 09/13/2011 01:38 PM, Ron Piggott wrote:
> Is there a way to only change accented characters and not HTML
> (Example: <p> </p> <a href=""> </a>)?
>
> The syntax
>
>     echo htmlentities( stripslashes(mysql_result($whats_new_result,0,"message")) ) . "\r\n";
>
> is doing everything (as I expect). I store breaking news within the
> database as HTML formatted text. I am trying to see if a workaround is
> available. Do I need to do a variety of search / replace to convert the
> noted characters above back after htmlentities? (I am just starting to
> get used to accented letters.)
>
> Thanks a lot for your help.
>
> Ron
>
> The Verse of the Day
> “Encouragement from God’s Word”
> http://www.TheVerseOfTheDay.info

If it is meant to be HTML then why run htmlentities(), especially before
storing it in the DB?

--
Thanks!
-Shawn
http://www.spidean.com
[PHP] Re: HTMLEntities as NUMERIC for XML
I already had a function to go from weird MS-Word characters to HTML Entities, which I was putting into the DB as such.

In retrospect, that function should have been called at output... Actually, I knew it should have, but convincing my co-workers was the proverbial brick wall, so I cheated and did it on data import and now I'm paying for it...

I ended up just copy-pasting the entity/number table from here:
http://www.w3schools.com/tags/ref_entities.asp
and go through a 2-step process:

    RAW DATA (pasted from Word in unknown code-page/charset/encoding)
    -> HTML Name Entities
    -> HTML Numeric Entities

This seemed to make the W3.org RSS validator happy.

It still looks goofy in the browser, but that's the problem of the users putting in this goofy stuff in the first place, so I'm shoving it back into their laps. My RSS feed validates, and the content within it not being right is the problem of the content creators. :-)

[soapbox on]
I'm pretty tired of dealing with this charset/codepage stuff, personally, after years of frustrating experiences, none ending in a real solution.

If anybody has a petition to abolish everything except for UTF-32, sign me up! :-v

UTF-32 is the biggest, right? The one that has ALL characters anybody needs?... Hey, I don't care, UTF-64 or UTF-128 is fine by me too. Disk space is cheap.

Just stop the insanity of endless incompatible irreversible calculations to substitute a bunch of numeric codes for characters, and make it socially unacceptable to use anything other than the one true encoding.

I'm sure somebody somewhere actually enjoys dealing with this [bleep], but I'm betting the majority are quite tired of it.
[/soapbox]
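The second step of that process (named entities to numeric ones) does not need a pasted lookup table. What follows is only a rough sketch, not what the poster actually used: it leans on html_entity_decode() to resolve each name and then writes the code point back as a numeric reference. The function names are hypothetical, and it assumes the text is UTF-8 and that the five names XML itself defines (amp, lt, gt, quot, apos) should be left alone.

```php
<?php
// Sketch only: function names are hypothetical; input is assumed to be UTF-8
// text that may contain HTML named entities such as &eacute; or &nbsp;.

// Return the Unicode code point of a single UTF-8 encoded character.
function utf8CodePoint(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0];
    }
    $codePoint = $bytes[0] & (0xFF >> ($count + 1)); // drop the length-marker bits
    for ($i = 1; $i < $count; $i++) {
        $codePoint = ($codePoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codePoint;
}

// Replace named entities with numeric ones, leaving XML's five predefined names alone.
function namedToNumericEntities(string $text): string
{
    $xmlSafe = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;'];
    return preg_replace_callback(
        '/&[A-Za-z][A-Za-z0-9]*;/',
        function (array $m) use ($xmlSafe): string {
            if (in_array($m[0], $xmlSafe, true)) {
                return $m[0];
            }
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            if ($decoded === $m[0]) {
                return $m[0]; // unknown name: left untouched, still needs manual attention
            }
            $out = '';
            // Some HTML5 names expand to more than one character, so walk them all.
            foreach (preg_split('//u', $decoded, -1, PREG_SPLIT_NO_EMPTY) as $char) {
                $out .= '&#' . utf8CodePoint($char) . ';';
            }
            return $out;
        },
        $text
    );
}

$result = namedToNumericEntities('Caf&eacute; &amp; cr&egrave;me&nbsp;br&ucirc;l&eacute;e &bogus;');
echo $result, "\n"; // Caf&#233; &amp; cr&#232;me&#160;br&#251;l&#233;e &bogus;
```

Names that PHP does not recognise pass through unchanged, so a feed validator would still flag them; that seemed safer here than silently dropping text.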
Re: [PHP] Re: HTMLEntities as NUMERIC for XML
On Tue, 2008-11-25 at 17:09 +, [EMAIL PROTECTED] wrote:
> I already had a function to go from weird MS-Word characters to HTML
> Entities, which I was putting into the DB as such.
>
> In retrospect, that function should have been called at output...
> Actually, I knew it should have, but convincing my co-workers was the
> proverbial brick wall, so I cheated and did it on data import and now
> I'm paying for it...
>
> I ended up just copy-pasting the entity/number table from here:
> http://www.w3schools.com/tags/ref_entities.asp
> and go through a 2-step process:
>
>     RAW DATA (pasted from Word in unknown code-page/charset/encoding)
>     -> HTML Name Entities
>     -> HTML Numeric Entities
>
> This seemed to make the W3.org RSS validator happy.
>
> It still looks goofy in the browser, but that's the problem of the users
> putting in this goofy stuff in the first place, so I'm shoving it back
> into their laps. My RSS feed validates, and the content within it not
> being right is the problem of the content creators. :-)
>
> [soapbox on]
> I'm pretty tired of dealing with this charset/codepage stuff,
> personally, after years of frustrating experiences, none ending in a
> real solution.
>
> If anybody has a petition to abolish everything except for UTF-32, sign
> me up! :-v
>
> UTF-32 is the biggest, right? The one that has ALL characters anybody
> needs?... Hey, I don't care, UTF-64 or UTF-128 is fine by me too. Disk
> space is cheap.
>
> Just stop the insanity of endless incompatible irreversible calculations
> to substitute a bunch of numeric codes for characters, and make it
> socially unacceptable to use anything other than the one true encoding.
>
> I'm sure somebody somewhere actually enjoys dealing with this [bleep],
> but I'm betting the majority are quite tired of it.
> [/soapbox]

I came across a similar problem using an AJAX thing, with MSWord characters in the text.

The way round the problem was to enclose everything inside CDATA blocks, which made the browsers happy to receive it, as the entities only had to be understood by the HTML browser now, not the XML parser. As RSS is an XML format, maybe this would help you?

Ash
www.ashleysheridan.co.uk
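The CDATA suggestion above might look roughly like this in PHP. This is an illustrative sketch rather than Ash's code: cdataWrap() and the sample feed item are invented here, and the only subtle part is that a literal "]]>" inside the content has to be split so it cannot close the section early.

```php
<?php
// Sketch only: cdataWrap() and the sample item below are hypothetical.
function cdataWrap(string $text): string
{
    // "]]>" would end the section early, so split it across two CDATA sections.
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $text) . ']]>';
}

// HTML with named entities that an XML parser would otherwise reject.
$body = '<p>Caf&eacute; &ldquo;special&rdquo;</p>';

$item = "<item>\n"
      . "  <title>Breaking news</title>\n"
      . '  <description>' . cdataWrap($body) . "</description>\n"
      . "</item>\n";

echo $item;
```

The XML parser then treats the description as opaque character data, and whatever renders it as HTML afterwards is the one that has to understand &eacute; and friends.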
[PHP] Re: HTMLEntities as NUMERIC for XML
[EMAIL PROTECTED] wrote:
> After reading this:
> http://validator.w3.org/feed/docs/error/UndefinedNamedEntity.html
> (all praise W3.org!)
>
> I am searching for a PHP library function that will convert all my
> &abc; into &#123;
>
> I have a zillion of these things from converting stupid MS Word
> characters into something that will, like, you know, actually WORK on
> the Internet, and do not really want to re-invent the wheel here.
>
> Somebody has to have written this function...
>
> I'm kind of surprised it's not http://php.net/xmlentities or somesuch...

Here's what I use:

    // Translate table for dumb Windows chars when user pastes from Word; function strips all > 160
    $win1252ToPlainTextArray = array(
        chr(130) => ',',
        chr(131) => '',
        chr(132) => ',,',
        chr(133) => '...',
        chr(134) => '+',
        chr(135) => '',
        chr(139) => '<',
        chr(145) => '\'',
        chr(146) => '\'',
        chr(147) => '"',
        chr(148) => '"',
        chr(149) => '*',
        chr(150) => '-',
        chr(151) => '-',
        chr(155) => '>',
        chr(160) => ' ',
    );

    function cleanWin1252Text($str)
    {
        // Translate array for many dumb Windows special chars; used for paste in textareas
        global $win1252ToPlainTextArray;
        $str = strtr($str, $win1252ToPlainTextArray);
        $str = trim($str);
        $patterns = array('%[\x7F-\x81]%', '%[\x83]%', '%[\x87-\x8A]%', '%[\x8C-\x90]%', '%[\x98-\xff]%');
        return preg_replace($patterns, '', $str); // Strip
    }
[PHP] Re: htmlentities()
Anthony Ritter wrote:
> <?php
> $str = "A 'quote' is <b>bold</b>";
> echo htmlentities($str);
> ?>
> ..
> // outputs: A 'quote' is <b>bold</b>
>
> Not sure why I am still getting the tags and spaces after the call to
> htmlentities().

Check out the source code of the output. Maybe you want the strip_tags() function?
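A small sketch of the difference the reply is pointing at: htmlentities() escapes the tags so the browser displays them as text, while strip_tags() removes them. ENT_COMPAT and 'UTF-8' are passed explicitly here as an assumption, so the result does not depend on the PHP version's defaults.

```php
<?php
$str = "A 'quote' is <b>bold</b>";

// Escaping: the markup survives as text. A browser renders the entities back
// into visible "<b>" and "</b>", which is why the page seems to show the tags.
$escaped = htmlentities($str, ENT_COMPAT, 'UTF-8');

// Stripping: the markup is removed and only the text content is kept.
$stripped = strip_tags($str);

echo $escaped, "\n";  // A 'quote' is &lt;b&gt;bold&lt;/b&gt;
echo $stripped, "\n"; // A 'quote' is bold
```

Viewing the page source rather than the rendered page shows the first line exactly as printed above.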
[PHP] Re: htmlentities and foreign characters from MS Word
That did it! It seems that my version of MySQL doesn't support Unicode encoding, only the various ISO encodings. So, I guess this translation is necessary before storing all text in the DB so foreign characters aren't broken when I retrieve them from the DB.

Thanks!

I2eptilex wrote:
> Well, it seems you have UTF-8 encoded text after your function. Use
> iconv to change it. See http://de3.php.net/manual/en/ref.iconv.php .
> Try doing this with your array before inserting it into the DB:
>
>     foreach ($insert_array as $key => $var) {
>         $new_arr[$key] = iconv("UTF-8", "ISO-8859-1", $var);
>     }
>
> It can be that your array has a different encoding than UTF-8 (check the
> manual for the htmlentities function), but I'm pretty sure that should
> solve it.
>
> I2eptilex
>
> Monty wrote:
>> I'm having a problem figuring out how to deal with foreign characters
>> in text that was copied from an MS Word document and pasted into a form
>> field. I'm not sure how this is getting stored in the MySQL database,
>> but, when I run htmlentities() on this text, each foreign character is
>> converted into 2 other foreign characters that don't at all represent
>> the original.
>>
>> For example, a lowercase u with an umlaut over it (ü) is somehow
>> displayed as an uppercase A with an umlaut over it followed by the 1/4
>> symbol after being parsed by htmlentities(). A lowercase o with an
>> umlaut displays as an uppercase A with an umlaut over it followed by
>> the paragraph symbol. It seems that the uppercase A w/umlaut is a
>> constant, and the next character changes. The ord() function returns
>> the same number for all of these foreign characters: 195.
>>
>> So, I'm not sure what's happening with these foreign characters, and if
>> there's any way to convert them to proper htmlentities before being
>> displayed in a browser. I thought htmlentities would do this, actually.
>>
>> Thanks!
>>
>> Monty.
Re: [PHP] Re: htmlentities and foreign characters from MS Word
You could store those texts as binary in MySQL...

----- Original Message -----
From: Monty [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, September 06, 2004 11:07 AM
Subject: [PHP] Re: htmlentities and foreign characters from MS Word

> That did it! It seems that my version of MySQL doesn't support Unicode
> encoding, only the various ISO encodings. So, I guess this translation
> is necessary before storing all text in the DB so foreign characters
> aren't broken when I retrieve them from the DB.
>
> Thanks!
>
> I2eptilex wrote:
[PHP] Re: htmlentities and foreign characters from MS Word
Well, it seems you have UTF-8 encoded text after your function. Use iconv to change it. See http://de3.php.net/manual/en/ref.iconv.php . Try doing this with your array before inserting it into the DB:

    foreach ($insert_array as $key => $var) {
        $new_arr[$key] = iconv("UTF-8", "ISO-8859-1", $var);
    }

It can be that your array has a different encoding than UTF-8 (check the manual for the htmlentities function), but I'm pretty sure that should solve it.

I2eptilex

Monty wrote:
> I'm having a problem figuring out how to deal with foreign characters in
> text that was copied from an MS Word document and pasted into a form
> field. I'm not sure how this is getting stored in the MySQL database,
> but, when I run htmlentities() on this text, each foreign character is
> converted into 2 other foreign characters that don't at all represent
> the original.
>
> For example, a lowercase u with an umlaut over it (ü) is somehow
> displayed as an uppercase A with an umlaut over it followed by the 1/4
> symbol after being parsed by htmlentities(). A lowercase o with an
> umlaut displays as an uppercase A with an umlaut over it followed by the
> paragraph symbol. It seems that the uppercase A w/umlaut is a constant,
> and the next character changes. The ord() function returns the same
> number for all of these foreign characters: 195.
>
> So, I'm not sure what's happening with these foreign characters, and if
> there's any way to convert them to proper htmlentities before being
> displayed in a browser. I thought htmlentities would do this, actually.
>
> Thanks!
>
> Monty.
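The symptoms Monty describes can be reproduced in a few lines, which may make the diagnosis easier to follow. This is a sketch under the assumption that the pasted text arrived as UTF-8 while htmlentities() was treating it as ISO-8859-1; the variable names are made up, and the "\u{...}" escape needs PHP 7 or later. (The first byte, 0xC3, is strictly A-with-tilde in ISO-8859-1, which is easy to mistake for an umlaut on screen.)

```php
<?php
// u-umlaut encoded as UTF-8 is two bytes: 0xC3 0xBC.
$utf8 = "\u{fc}";

$firstByte = ord($utf8[0]); // 195 (0xC3), the same lead byte for every character in this range

// Read as ISO-8859-1, each byte is taken as its own character:
// 0xC3 is A-tilde and 0xBC is the one-quarter sign.
$wrong = htmlentities($utf8, ENT_NOQUOTES, 'ISO-8859-1');

// Telling htmlentities() the real encoding gives the expected entity.
$right = htmlentities($utf8, ENT_NOQUOTES, 'UTF-8');

echo $firstByte, "\n"; // 195
echo $wrong, "\n";     // &Atilde;&frac14;
echo $right, "\n";     // &uuml;

// The iconv() route from the reply: convert first, then treat the text as Latin-1.
if (function_exists('iconv')) {
    $latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8); // single byte 0xFC
    echo htmlentities($latin1, ENT_NOQUOTES, 'ISO-8859-1'), "\n"; // &uuml;
}
```

Either fix works as long as the charset handed to htmlentities() matches the bytes actually stored; converting with iconv() loses any character that ISO-8859-1 cannot represent.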