Re: [PHP] RSS Feed Accented Characters
-----Original Message-----
From: Richard Quadling
Sent: Friday, September 30, 2011 2:53 PM
To: Ron Piggott
Cc: php-general@lists.php.net
Subject: Re: [PHP] RSS Feed Accented Characters

The byte sequence that is being received for the í is just 0xED.

php -r "file_put_contents('a.rss', file_get_contents('http://www.elversiculodeldia.info/peticiones-de-rezo-rss.xml'));"

This is NOT UTF-8 encoded data, but is ISO-8859-1 Latin-1 (most likely).

So as I see it you have 2 choices. Either declare ISO-8859-1 in the XML declaration or convert the encoded data to UTF-8.

It also means that the data in the SQL server is NOT UTF-8 and will need to be converted as well. I would recommend doing that first. That will mean reading the data as ISO-8859-1, converting it to UTF-8 and then saving it again.

I'd also be looking at the app that inputs the data into the DB initially.

To convert the text, here are 2 examples. I'm sure there are more ways.

<?php
// Save this file as ISO-8859-1 so that the literal below holds Latin-1 bytes.
$iso_text = 'El Versículo del Día: Pray For Others: Incoming Prayer Requests';

// Example 1: utf8_encode() converts ISO-8859-1 to UTF-8.
$utf_8_text = utf8_encode($iso_text);
var_dump($iso_text, $utf_8_text);

// Example 2: iconv() with explicit source and target encodings.
$utf_8_text = iconv('ISO-8859-1', 'UTF-8', $iso_text);
var_dump($iso_text, $utf_8_text);
?>

outputs ...

string(63) "El Vers�culo del D�a: Pray For Others: Incoming Prayer Requests"
string(65) "El Versículo del Día: Pray For Others: Incoming Prayer Requests"
string(63) "El Vers�culo del D�a: Pray For Others: Incoming Prayer Requests"
string(65) "El Versículo del Día: Pray For Others: Incoming Prayer Requests"

Notice that the correct strings are 2 bytes longer? In UTF-8 the í (U+00ED) is encoded as the two bytes 0xC3 0xAD.

-- 
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea

Richard,

I was unaware of the utf8_encode command. Thank you very much; this now works. Now I may continue with the translation into Spanish.

Ron
www.TheVerseOfTheDay.info

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] RSS Feed Accented Characters
On 30 September 2011 18:22, Ron Piggott wrote:
> Hi Richard:
>
> Having the XML declaration as the starting line didn't correct the problem.
>
> The RSS Feed is @
> http://www.elversiculodeldia.info/peticiones-de-rezo-rss.xml
>
> There are a variety of errors related to accented characters while using a
> feed validator:
> http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.elversiculodeldia.info%2Fpeticiones-de-rezo-rss.xml
>
> Also, while viewing the feed in Firefox, once the first accented character
> is displayed none of the rest of the feed is visible, except by right
> clicking and choosing "view source".
>
> The RSS Feed content will be populated by a database query. The database
> columns are set to utf8_unicode_ci
>
> How should I proceed?
> Ron

The byte sequence that is being received for the í is just 0xED.

php -r "file_put_contents('a.rss', file_get_contents('http://www.elversiculodeldia.info/peticiones-de-rezo-rss.xml'));"

This is NOT UTF-8 encoded data, but is ISO-8859-1 Latin-1 (most likely).

So as I see it you have 2 choices. Either declare ISO-8859-1 in the XML declaration or convert the encoded data to UTF-8.

It also means that the data in the SQL server is NOT UTF-8 and will need to be converted as well. I would recommend doing that first. That will mean reading the data as ISO-8859-1, converting it to UTF-8 and then saving it again.

I'd also be looking at the app that inputs the data into the DB initially.

To convert the text, here are 2 examples. I'm sure there are more ways.

<?php
// Save this file as ISO-8859-1 so that the literal below holds Latin-1 bytes.
$iso_text = 'El Versículo del Día: Pray For Others: Incoming Prayer Requests';

// Example 1: utf8_encode() converts ISO-8859-1 to UTF-8.
$utf_8_text = utf8_encode($iso_text);
var_dump($iso_text, $utf_8_text);

// Example 2: iconv() with explicit source and target encodings.
$utf_8_text = iconv('ISO-8859-1', 'UTF-8', $iso_text);
var_dump($iso_text, $utf_8_text);
?>

outputs ...

string(63) "El Vers�culo del D�a: Pray For Others: Incoming Prayer Requests"
string(65) "El Versículo del Día: Pray For Others: Incoming Prayer Requests"
string(63) "El Vers�culo del D�a: Pray For Others: Incoming Prayer Requests"
string(65) "El Versículo del Día: Pray For Others: Incoming Prayer Requests"

Notice that the correct strings are 2 bytes longer? In UTF-8 the í (U+00ED) is encoded as the two bytes 0xC3 0xAD.

-- 
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
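The two-byte difference mentioned above can be checked directly. What follows is only a minimal sketch, assuming the iconv and mbstring extensions are loaded; the variable names are illustrative and not from the thread. The Latin-1 byte is written as the escape "\xED" so the result does not depend on how the script file itself is saved. mb_convert_encoding() is shown as an alternative because utf8_encode() has been deprecated since PHP 8.2.

```php
<?php
// "\xED" is the single ISO-8859-1 byte for the accented i.
$iso_text = "El Vers\xEDculo del D\xEDa";

// iconv() is the call used in the message above; mb_convert_encoding()
// is an alternative from the mbstring extension (assumed to be loaded).
$via_iconv = iconv('ISO-8859-1', 'UTF-8', $iso_text);
$via_mb    = mb_convert_encoding($iso_text, 'UTF-8', 'ISO-8859-1');

// Each accented letter grows from one byte to two.
printf("%d bytes -> %d bytes\n", strlen($iso_text), strlen($via_iconv));

// Show the bytes of the single character before and after conversion.
echo bin2hex("\xED"), ' -> ', bin2hex(iconv('ISO-8859-1', 'UTF-8', "\xED")), "\n";
```

The same conversion would apply to each value read from the database before it is written back, provided the stored bytes really are ISO-8859-1.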
Re: [PHP] RSS Feed Accented Characters
-----Original Message-----
From: Richard Quadling
Sent: Friday, September 30, 2011 12:31 PM
To: Ron Piggott
Cc: php-general@lists.php.net
Subject: Re: [PHP] RSS Feed Accented Characters

Make sure you have the XML declaration, with its encoding set to UTF-8, as the first line of the output. That tells the reader that the file is a UTF-8 encoded file. Also, if you are sending HTTP headers, make sure that they say the encoding is UTF-8 and not a codepage.

Go UTF-8 everywhere.

-- 
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea

Hi Richard:

Having the XML declaration as the starting line didn't correct the problem.

The RSS Feed is @ http://www.elversiculodeldia.info/peticiones-de-rezo-rss.xml

There are a variety of errors related to accented characters while using a feed validator:
http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.elversiculodeldia.info%2Fpeticiones-de-rezo-rss.xml

Also, while viewing the feed in Firefox, once the first accented character is displayed none of the rest of the feed is visible, except by right clicking and choosing "view source".

The RSS Feed content will be populated by a database query. The database columns are set to utf8_unicode_ci

How should I proceed?

Ron
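A column collation of utf8_unicode_ci does not by itself guarantee that the bytes reaching PHP are valid UTF-8, so it can help to check each value before it goes into the feed. This is a minimal sketch, assuming the mbstring extension is available and assuming that any value which is not valid UTF-8 is ISO-8859-1. That second assumption is not something the thread establishes for every row. ensure_utf8() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns $value as UTF-8. Anything that is not
// already valid UTF-8 is ASSUMED to be ISO-8859-1 and is converted.
function ensure_utf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

// A Latin-1 value is converted, a UTF-8 value passes through unchanged.
echo bin2hex(ensure_utf8("D\xEDa")), "\n";
echo bin2hex(ensure_utf8("D\xC3\xADa")), "\n";
```

Checking first avoids double-encoding text that is already UTF-8, which a blind conversion would do.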
RE: [PHP] RSS Feed Accented Characters
Whoops! Forgive my try at it :)

-----Original Message-----
From: Richard Quadling [mailto:rquadl...@gmail.com]
Sent: Friday, September 30, 2011 11:47 AM
To: j...@cetaceasound.com
Cc: Ron Piggott; php-general@lists.php.net
Subject: Re: [PHP] RSS Feed Accented Characters

The entities are HTML entities. They are not XML entities: XML predefines only &amp;, &lt;, &gt;, &quot; and &apos;, so a named entity such as &iacute; is undefined unless a DTD declares it.

If the characters are displayed as ? then it is an encoding issue. What encoding are you using?

-- 
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
Re: [PHP] RSS Feed Accented Characters
On 30 September 2011 17:41, Jen Rasmussen wrote:
> Would this work?
>
> $content = "El Vers&iacute;culo del D&iacute;a";
> $rss_content .= "" . $content . "\r\n";
>
> Cheers!
> Jen

The entities are HTML entities. They are not XML entities: XML predefines only &amp;, &lt;, &gt;, &quot; and &apos;, so a named entity such as &iacute; is undefined unless a DTD declares it.

If the characters are displayed as ? then it is an encoding issue. What encoding are you using?

-- 
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
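To illustrate the HTML-versus-XML point above: one way to escape feed text is htmlspecialchars() with the ENT_XML1 flag (available from PHP 5.4), which touches only the characters that are special in XML and leaves accented letters as literal UTF-8 bytes. This is a sketch, not code from the thread. The xml_text() helper, the title element and the sample string are assumptions made for the example.

```php
<?php
// Hypothetical helper: escape text for use inside an XML element.
// Only &, <, > and quotes are replaced; accented letters stay as UTF-8.
function xml_text(string $utf8): string
{
    return htmlspecialchars($utf8, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// For contrast: htmlentities() turns the accented i into a named HTML
// entity, which an XML parser reports as undefined.
$html_style = htmlentities("\xC3\xAD", ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Sample UTF-8 text containing accented letters and an ampersand.
$title = "El Vers\xC3\xADculo del D\xC3\xADa & m\xC3\xA1s";

// The element name "title" is assumed here for illustration.
$rss_content = '<title>' . xml_text($title) . "</title>\r\n";

echo $html_style, "\n";
echo $rss_content;
```

This only works as intended when the input is already UTF-8 and the feed declares UTF-8, which is why the encoding question matters.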
RE: [PHP] RSS Feed Accented Characters
Would this work?

$content = "El Vers&iacute;culo del D&iacute;a";
$rss_content .= "" . $content . "\r\n";

Cheers!
Jen

-----Original Message-----
From: Ron Piggott [mailto:ron@actsministries.org]
Sent: Friday, September 30, 2011 11:26 AM
To: php-general@lists.php.net
Subject: [PHP] RSS Feed Accented Characters

I am trying to set up an RSS Feed in the Spanish language using a PHP cron job. I am unsure of how to deal with accented letters.

An example:

This syntax:

<?php
$rss_content .= "" . htmlentities("El Versículo del Día") . "\r\n";
?>

Outputs:

El Vers&iacute;culo del D&iacute;a

When I use an RSS Feed validator I receive the error message

This feed does not validate.

    line 24, column 20: XML parsing error: :24:20: undefined entity

I suspect the “;” is the issue, although it is needed for the accented letters. If I don’t use htmlentities() the accented characters can’t be viewed; they become a “?”. How should I proceed?

Ron
www.TheVerseOfTheDay.info
Re: [PHP] RSS Feed Accented Characters
On 30 September 2011 17:26, Ron Piggott wrote:
>
> I am trying to set up an RSS Feed in the Spanish language using a PHP cron
> job. I am unsure of how to deal with accented letters.
>
> An example:
>
> This syntax:
>
> <?php
> $rss_content .= "" . htmlentities("El Versículo del Día") . "\r\n";
> ?>
>
> Outputs:
>
> El Vers&iacute;culo del D&iacute;a
>
> When I use an RSS Feed validator I receive the error message
>
> This feed does not validate.
>
>     line 24, column 20: XML parsing error: :24:20: undefined entity
>
> I suspect the “;” is the issue, although it is needed for the accented
> letters. If I don’t use htmlentities() the accented characters can’t be
> viewed; they become a “?”. How should I proceed?
>
> Ron

Make sure you have the XML declaration, with its encoding set to UTF-8, as the first line of the output. That tells the reader that the file is a UTF-8 encoded file. Also, if you are sending HTTP headers, make sure that they say the encoding is UTF-8 and not a codepage.

Go UTF-8 everywhere.

-- 
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
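The exact declaration did not survive in the archived copy of this message. A typical UTF-8 declaration, and a matching HTTP header for the case where PHP serves the feed itself, might look like the sketch below. The rss_prolog() helper, the RSS version attribute and the application/rss+xml media type are assumptions for illustration, not text recovered from the thread.

```php
<?php
// Hypothetical helper: the opening lines of the feed. The declaration
// must be the very first bytes of the output, with nothing before it.
function rss_prolog(): string
{
    return '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<rss version="2.0">' . "\n";
}

// Only relevant when PHP serves the feed over HTTP. A cron job that
// writes a static file sends no headers, so this is skipped on the CLI.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: application/rss+xml; charset=UTF-8');
}

$rss_content = rss_prolog();
echo $rss_content;
```

The declaration only labels the document. As the later messages in the thread show, the bytes that follow still have to be UTF-8 for the label to be true.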