Re: [PHP] Replacing accented characters
On Thu, Mar 4, 2010 at 12:57, Skip Evans wrote:
> Hey all,
>
> Does anyone have a function that replaces accented characters
> with the non-accent equals?

This one by Sven on 21-APR-2005:

"Ae", "\xC6"=>"AE", "\xD6"=>"Oe", "\xDC"=>"Ue", "\xDE"=>"TH",
"\xDF"=>"ss", "\xE4"=>"ae", "\xE6"=>"ae", "\xF6"=>"oe",
"\xFC"=>"ue", "\xFE"=>"th"));
return($string);
}
?>

If you search via Google for 'php accented characters function' you'll see some user notes with code samples. I grabbed the one above from strtr() on php.net, and there are several others there and other places --- like on the preg_replace() page, if memory serves.

--
daniel.br...@parasane.net || danbr...@php.net
http://www.parasane.net/ || http://www.pilotpig.net/
Looking for hosting or dedicated servers? Ask me how we can fit your budget!

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
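The user note quoted above lost its opening lines in the archive, so what follows is not Sven's original but a minimal sketch of the same strtr() idea. It assumes the input is ISO-8859-1 (one byte per character). The function name replace_accents_latin1, the "\xC4" key, and the single-letter table are assumptions made for illustration.

```php
<?php
// Hypothetical helper: a sketch of the strtr() technique, assuming ISO-8859-1 input.
function replace_accents_latin1(string $string): string
{
    // Characters that expand to two ASCII letters, keyed by ISO-8859-1 byte.
    // The "\xC4" entry is an assumption; the other pairs appear in the quoted note.
    $multi = array(
        "\xC4" => "Ae", "\xC6" => "AE", "\xD6" => "Oe", "\xDC" => "Ue",
        "\xDE" => "TH", "\xDF" => "ss", "\xE4" => "ae", "\xE6" => "ae",
        "\xF6" => "oe", "\xFC" => "ue", "\xFE" => "th",
    );
    $string = strtr($string, $multi);

    // A small, incomplete one-to-one table (illustrative only).
    // Both strings must have the same length: byte N of $from maps to byte N of $to.
    $from = "\xC0\xC1\xC2\xC8\xC9\xCA\xE0\xE1\xE2\xE7\xE8\xE9\xEA\xEB\xEE\xEF\xF4\xF9\xFB";
    $to   = "AAAEEEaaaceeeeiiouu";

    return strtr($string, $from, $to);
}

echo replace_accents_latin1("M\xFCnchen"), "\n"; // Muenchen
```

With UTF-8 input the byte keys above would not match correctly and could damage multi-byte sequences, so the string would have to be converted first or the table rewritten with UTF-8 keys.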
Re: [PHP] Replacing accented characters
On Thu, 2010-03-04 at 11:57 -0600, Skip Evans wrote:
> Hey all,
>
> Does anyone have a function that replaces accented characters
> with the non-accent equals?
>
> I can't figure out how to get them out of the Ubuntu keyboard
> the way it is configured so I've just tried three different
> functions off the Internet and none of them have worked.
>
> Thanks,
> Skip
> --
>
> Skip Evans
> PenguinSites.com, LLC
> 503 S Baldwin St, #1
> Madison WI 53703
> 608.250.2720
> http://penguinsites.com
>
> Those of you who believe in
> telekinesis, raise my hand.
> -- Kurt Vonnegut

Ubuntu should have gucharmap, which should let you copy and paste accented characters.

Thanks,
Ash
http://www.ashleysheridan.co.uk
Re: [PHP] Replacing accented characters?
Robert Cummings wrote:
> Paul M Foster wrote:
>> On Thu, Jan 28, 2010 at 02:38:52PM -0500, tedd wrote:
>>> My point was more to the theme that we are an eclectic group of people
>>> with a wide range of knowledge and skills. Individually we may have
>>> trouble finding our ass, but together we can find the answer to many
>>> things.
>>
>> I just got this image of all of us at a table at the local Chili's
>> twirling around in place like dogs, trying to find our asses.
>
> I wish you'd stop grabbing ming!! :B

Mine even... doh!

--
http://www.interjinn.com
Application and Templating Framework for PHP
Re: [PHP] Replacing accented characters?
Paul M Foster wrote:
> On Thu, Jan 28, 2010 at 02:38:52PM -0500, tedd wrote:
>> My point was more to the theme that we are an eclectic group of people
>> with a wide range of knowledge and skills. Individually we may have
>> trouble finding our ass, but together we can find the answer to many
>> things.
>
> I just got this image of all of us at a table at the local Chili's
> twirling around in place like dogs, trying to find our asses.

I wish you'd stop grabbing ming!! :B

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP
Re: [PHP] Replacing accented characters?
On Thu, Jan 28, 2010 at 02:38:52PM -0500, tedd wrote:
> My point was more to the theme that we are an eclectic group of people
> with a wide range of knowledge and skills. Individually we may have
> trouble finding our ass, but together we can find the answer to many
> things.

I just got this image of all of us at a table at the local Chili's twirling around in place like dogs, trying to find our asses.

Paul

--
Paul M. Foster
Re: [PHP] Replacing accented characters?
At 9:28 AM -0500 1/28/10, Robert Cummings wrote:
> tedd wrote:
>> At 12:17 PM +0100 1/28/10, Marcus Gnaß wrote:
>>> On 28.01.2010 03:40, Paul M Foster wrote:
>>>> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>>>>> Hey all,
>>>>>
>>>>> I'm looking for recommendations on how to replace accented
>>>>> characters, like e and u with those two little dots above
>>>>> them, with the regular e and u characters.
>>>>
>>>> FWIW, those two dots are called an "umlaut".
>>>>
>>>> Paul
>>>
>>> FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
>>> The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
>>> But two dots above an e or i are called "Trema" (diacritic mark).
>>>
>>> Marcus
>>
>> On what other list could we learn that?
>
> A linguistics list I'd wager. :B
>
> Cheers,
> Rob.

And let's not forget the Trema list. :-)

My point was more to the theme that we are an eclectic group of people with a wide range of knowledge and skills. Individually we may have trouble finding our ass, but together we can find the answer to many things.

Cheers,
tedd

--
http://sperling.com http://ancientstones.com http://earthstones.com
Re: [PHP] Replacing accented characters?
tedd wrote:
> At 12:17 PM +0100 1/28/10, Marcus Gnaß wrote:
>> On 28.01.2010 03:40, Paul M Foster wrote:
>>> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>>>> Hey all,
>>>>
>>>> I'm looking for recommendations on how to replace accented
>>>> characters, like e and u with those two little dots above
>>>> them, with the regular e and u characters.
>>>
>>> FWIW, those two dots are called an "umlaut".
>>>
>>> Paul
>>
>> FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
>> The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
>> But two dots above an e or i are called "Trema" (diacritic mark).
>>
>> Marcus
>
> On what other list could we learn that?

A linguistics list I'd wager. :B

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP
Re: [PHP] Replacing accented characters?
At 12:17 PM +0100 1/28/10, Marcus Gnaß wrote:
> On 28.01.2010 03:40, Paul M Foster wrote:
>> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>>> Hey all,
>>>
>>> I'm looking for recommendations on how to replace accented
>>> characters, like e and u with those two little dots above
>>> them, with the regular e and u characters.
>>
>> FWIW, those two dots are called an "umlaut".
>>
>> Paul
>
> FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
> The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
> But two dots above an e or i are called "Trema" (diacritic mark).
>
> Marcus

On what other list could we learn that?

Thanks,
tedd

--
http://sperling.com http://ancientstones.com http://earthstones.com
Re: [PHP] Replacing accented characters?
On 28.01.2010 03:40, Paul M Foster wrote:
> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>
>> Hey all,
>>
>> I'm looking for recommendations on how to replace accented
>> characters, like e and u with those two little dots above
>> them, with the regular e and u characters.
>
> FWIW, those two dots are called an "umlaut".
>
> Paul

FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
But two dots above an e or i are called "Trema" (diacritic mark).

Marcus
Re: [PHP] Replacing accented characters?
On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
> Hey all,
>
> I'm looking for recommendations on how to replace accented
> characters, like e and u with those two little dots above
> them, with the regular e and u characters.

FWIW, those two dots are called an "umlaut".

Paul

--
Paul M. Foster
Re: [PHP] Replacing accented characters?
Looks like strtr() is the way to go?

Skip

Skip Evans wrote:
> Hey all,
>
> I'm looking for recommendations on how to replace accented
> characters, like e and u with those two little dots above
> them, with the regular e and u characters.
>
> I'm finding some solutions via Google, but would like to hear
> from some of you to hear how you handle those situations.
>
> Thanks,
> Skip

--
Skip Evans
PenguinSites.com, LLC
503 S Baldwin St, #1
Madison WI 53703
608.250.2720
http://penguinsites.com

Those of you who believe in
telekinesis, raise my hand.
-- Kurt Vonnegut
Re: [PHP] Replacing accented characters by non-accented characters
2008/5/14 Bastien Koert:
> Why should the server folder name matter? Make it a hash and store the user
> provided name in a db. Then when presenting the data to the user just show
> the user provided name as the folder name. This would also handle multiple
> users trying to use the same folder name for their stuff.

It may be that the whole folder hierarchy is publicly accessible. Some faculties at my university do that.

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
Re: [PHP] Replacing accented characters by non-accented characters
On Tue, May 13, 2008 at 8:39 AM, Per Jessen wrote:
> Yannick Warnier wrote:
>
> > That would probably work out if it wasn't too dependent on the locales
> > to work. I'm developing an open-source product which could end up on a
> > server without the locales for French but be used by some French
> > people, which would make (as far as I can get out of one comment from
> > Richie in the PHP manual) the transliteration somewhat wrong.
>
> With the kind of rough conversion/transformation you're doing, is the
> locale really very important anyway?
>
> /Per Jessen, Zürich

Why should the server folder name matter? Make it a hash and store the user-provided name in a db. Then when presenting the data to the user, just show the user-provided name as the folder name. This would also handle multiple users trying to use the same folder name for their stuff.

--
Bastien

Cat, the other other white meat
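A rough sketch of Bastien's suggestion, with the assumptions stated up front: sha1() is used as the hash, the user id is mixed in so that identical titles from different users do not collide, and a plain array stands in for the database table. The names make_folder_name and $folders are made up for the example.

```php
<?php
// Hypothetical helper: derive an ASCII-only folder name from a free-text title.
// Mixing in the user id is an assumption, so two users can reuse the same title.
function make_folder_name(string $title, int $userId): string
{
    return sha1($userId . ':' . $title);
}

// Stand-in for a database table: folder name => title shown to the user.
$folders = array();

$title  = 'été';
$folder = make_folder_name($title, 42);
$folders[$folder] = $title;

// The file system only ever sees the hash; the user only ever sees the title.
echo $folder, ' => ', $folders[$folder], "\n";
```

As Dotan points out in his reply, this falls short when the folder hierarchy itself is visible to users, since the hash is not readable.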
Re: [PHP] Replacing accented characters by non-accented characters
Yannick Warnier wrote:
> That would probably work out if it wasn't too dependent on the locales
> to work. I'm developing an open-source product which could end up on a
> server without the locales for French but be used by some French
> people, which would make (as far as I can get out of one comment from
> Richie in the PHP manual) the transliteration somewhat wrong.

With the kind of rough conversion/transformation you're doing, is the locale really very important anyway?

/Per Jessen, Zürich
Re: [PHP] Replacing accented characters by non-accented characters
Yannick:

Considering that we just had a flurry of pet peeves on the list, I'll rant on one of mine.

At 1:25 PM -0500 5/12/08, Yannick Warnier wrote:
> I'm trying to give a universally-manageable directory name to an item
> using a free-text title. I want to avoid every type of accentuated
> character and everything outside of pure ASCII to make it the most
> portable possible.
> Generating a random hash is not acceptable as we want to be the most
> user-friendly possible.

As Rocky (the flying squirrel of Bullwinkle fame) once said when a gentleman in a black suit identified himself as "Military Intelligence" -- "That sounds like a contradiction in terms."

To make something as user-friendly as possible is to accommodate as many users as possible, including those whose native language is not English -- like 96 percent of the world.

You may want to call whatever you are doing a "universally-manageable directory", but it can't be if it rules out the majority of the universe (as we know it).

Why not embrace Unicode and not worry about it? I suggest you read "Building Scalable Web Sites" by Henderson -- specifically chapter 4, which deals with Unicode, Internationalization and Localization.

Looks to me like you're trying to fit a gross into a dozen -- that's a lossy process that's probably not going to do what you want.

Cheers,
tedd

--
http://sperling.com http://ancientstones.com http://earthstones.com
Re: [PHP] Replacing accented characters by non-accented characters
2008/5/12 Yannick Warnier:
>> Why are you removing the accents? Why not store/process the data as
>> UTF-8, which supports all the accents in all the languages, and even
>> non-latin languages. You mention Arabic, which does not use accented
>> latin characters (Maybe you are thinking of Turkish, Uzbek or Tajik).
>> UTF-8 supports Arabic, Russian, Greek, Latin including modified
>> accented letters, and almost everything else, CJK included.
>>
>> What is your end goal? Why are you removing the accents?
>
> Hi Dotan,
>
> I'm trying to give a universally-manageable directory name to an item
> using a free-text title. I want to avoid every type of accentuated
> character and everything outside of pure ASCII to make it the most
> portable possible.
> Generating a random hash is not acceptable as we want to be the most
> user-friendly possible.

I suppose that is a good reason. I actually tried to come up with a use case that justifies the removal of latin accents, and couldn't. I'll remember that.

Tell me, what are you doing with Hebrew, Russian, Arabic, and other non-latin scripts? If you want, I have some code that roughly transliterates Hebrew <-> Latin on the http://gibberish.co.il website.

> I'm talking about Arabic not to remove accentuated characters, but in
> case there would be a transliteration function that allows me to turn an
> Arabic character into something similar in terms of pronunciation but in
> ASCII.

If it needs to be transliterated back to Arabic you will have fun with the letter forms! I can give you code that does it for Hebrew, but Hebrew only has 5 final letters, and no explicit first- or middle- forms.

> So the goal is to create a directory name that is both intuitive *and*
> portable for the user and the admin. The title is kept for the user, but
> there is a generic shortened code that is generated following the given
> title.
> We used to ask for a title in a webform, but realised our users liked it
> much better when we give them the possibility to generate the code
> themselves, but generating one ourselves by default.
> I just realised that the developer who did it seemed to make it using
> html codes directly, so we end up with codes like "EACUTETEACUTE" for an
> item called "été", while "ETE" would be far better.
>
> Yannick

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
Re: [PHP] Replacing accented characters by non-accented characters
On Monday, 12 May 2008 at 19:07 +0300, Dotan Cohen wrote:
> 2008/5/12 Yannick Warnier:
> > Hello,
> >
> > I've been trying to find something nice to transform an accentuated
> > string into a non-accentuated string. Obviously, I'm mostly playing
> > inside the European languages, but any method that could transform
> > arabic or asian characters to plain non-accentuated characters would be
> > perfect.
> >
> > I have found a number of solutions, ranging from str_replace() for every
> > known accentuated character to strtr() to a preg_replace() of a
> > conversion of the string to html characters then removing the "&" and
> > the "alteration" string (acute, grave, circ, ...).
> >
> > I must say the last one seems to work better because it's less affected
> > by charset changes, but it still seems awfully slow to me and I would
> > like to know if there is any function that exists that could do that for
> > me?
> >
> > Yannick
>
> Why are you removing the accents? Why not store/process the data as
> UTF-8, which supports all the accents in all the languages, and even
> non-latin languages. You mention Arabic, which does not use accented
> latin characters (Maybe you are thinking of Turkish, Uzbek or Tajik).
> UTF-8 supports Arabic, Russian, Greek, Latin including modified
> accented letters, and almost everything else, CJK included.
>
> What is your end goal? Why are you removing the accents?

Hi Dotan,

I'm trying to give a universally-manageable directory name to an item using a free-text title. I want to avoid every type of accentuated character and everything outside of pure ASCII to make it the most portable possible.
Generating a random hash is not acceptable as we want to be the most user-friendly possible.

I'm talking about Arabic not to remove accentuated characters, but in case there would be a transliteration function that allows me to turn an Arabic character into something similar in terms of pronunciation but in ASCII.

So the goal is to create a directory name that is both intuitive *and* portable for the user and the admin. The title is kept for the user, but there is a generic shortened code that is generated following the given title.
We used to ask for a title in a webform, but realised our users liked it much better when we give them the possibility to generate the code themselves, but generating one ourselves by default.
I just realised that the developer who did it seemed to make it using html codes directly, so we end up with codes like "EACUTETEACUTE" for an item called "été", while "ETE" would be far better.

Yannick
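To illustrate the entity-based approach mentioned in this thread: stripping only the "&" and ";" from "&eacute;" leaves "eacute", which is where a code like "EACUTETEACUTE" comes from. The sketch below keeps just the base letter instead. It is not the project's actual code; the function name title_to_code and the list of entity suffixes are assumptions, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: build an upper-case ASCII code from a UTF-8 title.
function title_to_code(string $title): string
{
    // "é" becomes "&eacute;", "ç" becomes "&ccedil;", and so on.
    $entities = htmlentities($title, ENT_QUOTES, 'UTF-8');

    // Keep only the base letter of each accent entity: "&eacute;" -> "e".
    $plain = preg_replace(
        '/&([A-Za-z])(?:acute|grave|circ|uml|tilde|cedil|ring|slash);/',
        '$1',
        $entities
    );

    // Ligature entities keep both letters: "&aelig;" -> "ae", "&szlig;" -> "sz".
    $plain = preg_replace('/&([A-Za-z]{2})lig;/', '$1', $plain);

    // Decode whatever entities remain, then drop everything that is not
    // an ASCII letter or digit (spaces and punctuation included).
    $plain = html_entity_decode($plain, ENT_QUOTES, 'UTF-8');
    $plain = preg_replace('/[^A-Za-z0-9]+/', '', $plain);

    return strtoupper($plain);
}

echo title_to_code('été'), "\n"; // ETE
```

Characters that have no named entity with one of these suffixes (Arabic or Hebrew letters, for example) are simply dropped by the last preg_replace(), so this only covers the accented Latin case.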
Re: [PHP] Replacing accented characters by non-accented characters
2008/5/12 Yannick Warnier:
> Hello,
>
> I've been trying to find something nice to transform an accentuated
> string into a non-accentuated string. Obviously, I'm mostly playing
> inside the European languages, but any method that could transform
> arabic or asian characters to plain non-accentuated characters would be
> perfect.
>
> I have found a number of solutions, ranging from str_replace() for every
> known accentuated character to strtr() to a preg_replace() of a
> conversion of the string to html characters then removing the "&" and
> the "alteration" string (acute, grave, circ, ...).
>
> I must say the last one seems to work better because it's less affected
> by charset changes, but it still seems awfully slow to me and I would
> like to know if there is any function that exists that could do that for
> me?
>
> Yannick

Why are you removing the accents? Why not store/process the data as UTF-8, which supports all the accents in all the languages, and even non-latin languages. You mention Arabic, which does not use accented latin characters (maybe you are thinking of Turkish, Uzbek or Tajik). UTF-8 supports Arabic, Russian, Greek, Latin including modified accented letters, and almost everything else, CJK included.

What is your end goal? Why are you removing the accents?

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
Re: [PHP] Replacing accented characters by non-accented characters
Thanks James,

That would probably work out if it wasn't too dependent on the locales to work. I'm developing an open-source product which could end up on a server without the locales for French but be used by some French people, which would make (as far as I can get out of one comment from Richie in the PHP manual) the transliteration somewhat wrong.

The dependency on iconv is also a minor problem to me as we are rather using MB at the moment, but I guess I might find something similar in MB anyway.

Thanks,

Yannick

On Monday, 12 May 2008 at 16:28 +0100, James Dempster wrote:
> oops wrong way round
> echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', 'français');
>
> On Mon, May 12, 2008 at 4:27 PM, James Dempster wrote:
>
> > maybe try iconv (http://uk.php.net/manual/en/function.iconv.php)
> > e.g.
> >
> > echo iconv('ISO-8859-1', 'UTF-8//TRANSLIT', 'français');
> >
> > --
> > /James
> >
> > On Mon, May 12, 2008 at 4:09 PM, Yannick Warnier wrote:
> >
> > > Hello,
> > >
> > > I've been trying to find something nice to transform an accentuated
> > > string into a non-accentuated string. Obviously, I'm mostly playing
> > > inside the European languages, but any method that could transform
> > > arabic or asian characters to plain non-accentuated characters would be
> > > perfect.
> > >
> > > I have found a number of solutions, ranging from str_replace() for every
> > > known accentuated character to strtr() to a preg_replace() of a
> > > conversion of the string to html characters then removing the "&" and
> > > the "alteration" string (acute, grave, circ, ...).
> > >
> > > I must say the last one seems to work better because it's less affected
> > > by charset changes, but it still seems awfully slow to me and I would
> > > like to know if there is any function that exists that could do that for
> > > me?
> > >
> > > Yannick
Re: [PHP] Replacing accented characters by non-accented characters
oops wrong way round

echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', 'français');

On Mon, May 12, 2008 at 4:27 PM, James Dempster wrote:
> maybe try iconv (http://uk.php.net/manual/en/function.iconv.php)
> e.g.
>
> echo iconv('ISO-8859-1', 'UTF-8//TRANSLIT', 'français');
>
> --
> /James
>
> On Mon, May 12, 2008 at 4:09 PM, Yannick Warnier wrote:
>
> > Hello,
> >
> > I've been trying to find something nice to transform an accentuated
> > string into a non-accentuated string. Obviously, I'm mostly playing
> > inside the European languages, but any method that could transform
> > arabic or asian characters to plain non-accentuated characters would be
> > perfect.
> >
> > I have found a number of solutions, ranging from str_replace() for every
> > known accentuated character to strtr() to a preg_replace() of a
> > conversion of the string to html characters then removing the "&" and
> > the "alteration" string (acute, grave, circ, ...).
> >
> > I must say the last one seems to work better because it's less affected
> > by charset changes, but it still seems awfully slow to me and I would
> > like to know if there is any function that exists that could do that for
> > me?
> >
> > Yannick
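A hedged variation on the iconv() call above, for anyone who actually wants the accents gone: ISO-8859-1 still contains "ç", so converting to it leaves 'français' unchanged in appearance. The sketch below targets ASCII instead, which is a substitution on my part and not what was posted. The function name to_ascii and the locale names are assumptions, and the result depends on the iconv implementation and on the locales installed on the server, so no expected output is shown.

```php
<?php
// Hypothetical helper: transliterate UTF-8 text to ASCII with iconv().
// Returns a string, or false if iconv() cannot convert the input.
function to_ascii(string $utf8)
{
    // //TRANSLIT consults the LC_CTYPE locale on glibc systems. The locale
    // names are assumptions; setlocale() returns false if none is installed.
    setlocale(LC_CTYPE, 'en_US.UTF-8', 'C.UTF-8');

    // "@" hides the notice iconv() raises for characters it cannot map.
    return @iconv('UTF-8', 'ASCII//TRANSLIT', $utf8);
}

$result = to_ascii('français');

// Depending on the platform this may print "francais", "fran?ais",
// or nothing at all when the conversion fails.
echo ($result === false ? '(conversion failed)' : $result), "\n";
```

Because the output varies by platform, a fixed lookup table such as the strtr() approach behaves more predictably on servers where the locale setup is unknown.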
Re: [PHP] Replacing accented characters by non-accented characters
maybe try iconv (http://uk.php.net/manual/en/function.iconv.php)
e.g.

echo iconv('ISO-8859-1', 'UTF-8//TRANSLIT', 'français');

--
/James

On Mon, May 12, 2008 at 4:09 PM, Yannick Warnier wrote:
> Hello,
>
> I've been trying to find something nice to transform an accentuated
> string into a non-accentuated string. Obviously, I'm mostly playing
> inside the European languages, but any method that could transform
> arabic or asian characters to plain non-accentuated characters would be
> perfect.
>
> I have found a number of solutions, ranging from str_replace() for every
> known accentuated character to strtr() to a preg_replace() of a
> conversion of the string to html characters then removing the "&" and
> the "alteration" string (acute, grave, circ, ...).
>
> I must say the last one seems to work better because it's less affected
> by charset changes, but it still seems awfully slow to me and I would
> like to know if there is any function that exists that could do that for
> me?
>
> Yannick