Re: [PHP] Replacing accented characters

2010-03-04 Thread Daniel Brown
On Thu, Mar 4, 2010 at 12:57, Skip Evans  wrote:
> Hey all,
>
> Does anyone have a function that replaces accented characters
> with the non-accented equivalents?

This one by Sven on 21-APR-2005:

"Ae", "\xC6"=>"AE",
"\xD6"=>"Oe", "\xDC"=>"Ue", "\xDE"=>"TH", "\xDF"=>"ss", "\xE4"=>"ae",
"\xE6"=>"ae", "\xF6"=>"oe", "\xFC"=>"ue", "\xFE"=>"th"));
return($string);
}
?>

If you search via Google for 'php accented characters function'
you'll see some user notes with code samples.  I grabbed the one above
from strtr() on php.net, and there are several others there and other
places --- like on the preg_replace() page, if memory serves.
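
For reference, here is a minimal runnable sketch of the kind of strtr()
function quoted above, whose opening lines did not survive in this archive.
It assumes ISO-8859-1 (Latin-1) input, as the \xNN byte values suggest. The
function name replace_accents_latin1, the single-letter mappings in the first
strtr() call, and the "\xC4" key are assumptions for illustration, not Sven's
original code; the rest of the two-letter array is taken from the excerpt.

```php
<?php
// Hypothetical name: the original function header is missing from the archive.
function replace_accents_latin1(string $string): string
{
    // Latin-1 bytes that map to exactly one ASCII letter (illustrative subset).
    $string = strtr(
        $string,
        "\xC0\xC1\xC2\xC3\xC5\xC7\xC8\xC9\xCA\xCB\xE0\xE1\xE2\xE3\xE5\xE7\xE8\xE9\xEA\xEB",
        "AAAAACEEEEaaaaaceeee"
    );
    // Bytes that expand to two ASCII letters need the array form of strtr().
    // The "\xC4" key is inferred; the other pairs come from the quoted excerpt.
    return strtr($string, array(
        "\xC4" => "Ae", "\xC6" => "AE", "\xD6" => "Oe", "\xDC" => "Ue",
        "\xDE" => "TH", "\xDF" => "ss", "\xE4" => "ae", "\xE6" => "ae",
        "\xF6" => "oe", "\xFC" => "ue", "\xFE" => "th",
    ));
}
```

Called as replace_accents_latin1("fran\xE7ais"), this returns "francais".
UTF-8 input would have to be converted to Latin-1 first, or mapped with
UTF-8 keys instead.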


-- 

daniel.br...@parasane.net || danbr...@php.net
http://www.parasane.net/ || http://www.pilotpig.net/

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Replacing accented characters

2010-03-04 Thread Ashley Sheridan
On Thu, 2010-03-04 at 11:57 -0600, Skip Evans wrote:

> Hey all,
> 
> Does anyone have a function that replaces accented characters
> with the non-accented equivalents?
> 
> I can't figure out how to get them out of the Ubuntu keyboard
> the way it is configured so I've just tried three different
> functions off the Internet and none of them have worked.
> 
> Thanks,
> Skip
> -- 
> 
> Skip Evans
> PenguinSites.com, LLC
> 503 S Baldwin St, #1
> Madison WI 53703
> 608.250.2720
> http://penguinsites.com
> 
> Those of you who believe in
> telekinesis, raise my hand.
>   -- Kurt Vonnegut
> 


Ubuntu should have gucharmap which should let you copy and paste
accented characters.

Thanks,
Ash
http://www.ashleysheridan.co.uk




Re: [PHP] Replacing accented characters?

2010-01-28 Thread Robert Cummings

Robert Cummings wrote:
> Paul M Foster wrote:
>> On Thu, Jan 28, 2010 at 02:38:52PM -0500, tedd wrote:
>>
>>> My point was more to the theme that we are an eclectic group of people
>>> with a wide range of knowledge and skills. Individually we may have
>>> trouble finding our ass, but together we can find the answer to many
>>> things.
>>
>> I just got this image of all of us at a table at the local Chili's
>> twirling around in place like dogs, trying to find our asses.
>
> I wish you'd stop grabbing ming!!  :B

Mine even... doh!


--
http://www.interjinn.com
Application and Templating Framework for PHP




Re: [PHP] Replacing accented characters?

2010-01-28 Thread Robert Cummings

Paul M Foster wrote:
> On Thu, Jan 28, 2010 at 02:38:52PM -0500, tedd wrote:
>
>> My point was more to the theme that we are an eclectic group of people
>> with a wide range of knowledge and skills. Individually we may have
>> trouble finding our ass, but together we can find the answer to many
>> things.
>
> I just got this image of all of us at a table at the local Chili's
> twirling around in place like dogs, trying to find our asses.

I wish you'd stop grabbing ming!!  :B

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP




Re: [PHP] Replacing accented characters?

2010-01-28 Thread Paul M Foster
On Thu, Jan 28, 2010 at 02:38:52PM -0500, tedd wrote:



> My point was more to the theme that we are an eclectic group of people 
> with a wide range of knowledge and skills. Individually we may have  
> trouble finding our ass, but together we can find the answer to many 
> things.

I just got this image of all of us at a table at the local Chili's
twirling around in place like dogs, trying to find our asses. 

Paul

-- 
Paul M. Foster




Re: [PHP] Replacing accented characters?

2010-01-28 Thread tedd

At 9:28 AM -0500 1/28/10, Robert Cummings wrote:
> tedd wrote:
>> At 12:17 PM +0100 1/28/10, Marcus Gnaß wrote:
>>> On 28.01.2010 03:40, Paul M Foster wrote:
>>>> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> I'm looking for recommendations on how to replace accented
>>>>> characters, like e and u with those two little dots above
>>>>> them, with the regular e and u characters.
>>>>
>>>> FWIW, those two dots are called an "umlaut".
>>>>
>>>> Paul
>>>
>>> FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
>>> The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
>>> But two dots above an e or i are called "Trema" (diacritic mark).
>>>
>>> Marcus
>>
>> On what other list could we learn that?
>
> A linguistics list I'd wager.
>
> :B
>
> Cheers,
> Rob.


And let's not forget the Trema list.  :-)

My point was more to the theme that we are an 
eclectic group of people with a wide range of 
knowledge and skills. Individually we may have 
trouble finding our ass, but together we can find 
the answer to many things.


Cheers,

tedd

--
---
http://sperling.com  http://ancientstones.com  http://earthstones.com




Re: [PHP] Replacing accented characters?

2010-01-28 Thread Robert Cummings

tedd wrote:
> At 12:17 PM +0100 1/28/10, Marcus Gnaß wrote:
>> On 28.01.2010 03:40, Paul M Foster wrote:
>>> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>>>
>>>> Hey all,
>>>>
>>>> I'm looking for recommendations on how to replace accented
>>>> characters, like e and u with those two little dots above
>>>> them, with the regular e and u characters.
>>>
>>> FWIW, those two dots are called an "umlaut".
>>>
>>> Paul
>>
>> FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
>> The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
>> But two dots above an e or i are called "Trema" (diacritic mark).
>>
>> Marcus
>
> On what other list could we learn that?


A linguistics list I'd wager.

:B

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP




Re: [PHP] Replacing accented characters?

2010-01-28 Thread tedd

At 12:17 PM +0100 1/28/10, Marcus Gnaß wrote:
> On 28.01.2010 03:40, Paul M Foster wrote:
>> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
>>
>>> Hey all,
>>>
>>> I'm looking for recommendations on how to replace accented
>>> characters, like e and u with those two little dots above
>>> them, with the regular e and u characters.
>>
>> FWIW, those two dots are called an "umlaut".
>>
>> Paul
>
> FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
> The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
> But two dots above an e or i are called "Trema" (diacritic mark).
>
> Marcus


On what other list could we learn that?

Thanks,

tedd

--
---
http://sperling.com  http://ancientstones.com  http://earthstones.com




Re: [PHP] Replacing accented characters?

2010-01-28 Thread Marcus Gnaß
On 28.01.2010 03:40, Paul M Foster wrote:
> On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:
> 
>> Hey all,
>>
>> I'm looking for recommendations on how to replace accented
>> characters, like e and u with those two little dots above
>> them, with the regular e and u characters.
> 
> FWIW, those two dots are called an "umlaut".
> 
> Paul
> 

FWIW, the whole letters ÄäÖöÜü are called "Umlaute" (umlauts).
The two dots above *these* letters are "Umlautzeichen" (umlaut marks).
But two dots above an e or i are called "Trema" (diacritic mark).

Marcus




Re: [PHP] Replacing accented characters?

2010-01-27 Thread Paul M Foster
On Wed, Jan 27, 2010 at 04:55:46PM -0600, Skip Evans wrote:

> Hey all,
>
> I'm looking for recommendations on how to replace accented
> characters, like e and u with those two little dots above
> them, with the regular e and u characters.

FWIW, those two dots are called an "umlaut".

Paul

-- 
Paul M. Foster




Re: [PHP] Replacing accented characters?

2010-01-27 Thread Skip Evans

Looks like strtr() is the way to go?

Skip
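
A minimal sketch of the strtr() approach for the characters mentioned in the
question, assuming the text is UTF-8 and the source file is saved as UTF-8.
The helper name strip_diacritics_utf8 and the contents of the map are
hypothetical and far from complete; a real map would need an entry for every
character expected in the data.

```php
<?php
// Hypothetical helper: replaces a few accented UTF-8 letters with plain ASCII.
function strip_diacritics_utf8(string $text): string
{
    // The array form of strtr() matches whole multi-byte keys, so UTF-8 is safe.
    static $map = array(
        'ä' => 'a', 'ë' => 'e', 'ï' => 'i', 'ö' => 'o', 'ü' => 'u',
        'Ä' => 'A', 'Ë' => 'E', 'Ï' => 'I', 'Ö' => 'O', 'Ü' => 'U',
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'à' => 'a', 'ç' => 'c',
    );
    return strtr($text, $map);
}
```

With this map, strip_diacritics_utf8('Noël über') gives 'Noel uber'.
Characters missing from the map pass through unchanged.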

Skip Evans wrote:
> Hey all,
>
> I'm looking for recommendations on how to replace accented characters,
> like e and u with those two little dots above them, with the regular e
> and u characters.
>
> I'm finding some solutions via Google, but would like to hear from some
> of you to hear how you handle those situations.
>
> Thanks,
> Skip



--

Skip Evans
PenguinSites.com, LLC
503 S Baldwin St, #1
Madison WI 53703
608.250.2720
http://penguinsites.com

Those of you who believe in
telekinesis, raise my hand.
 -- Kurt Vonnegut




Re: [PHP] Replacing accented characters by non-accented characters

2008-05-14 Thread Dotan Cohen
2008/5/14 Bastien Koert <[EMAIL PROTECTED]>:
> Why should the server folder name matter? Make it a hash and store the user
> provided name in a db. Then when presenting the data to the user just show
> the user provided name as the folder name. This would also handle multiple
> users trying to use the same folder name for their stuff.
>

It may be that the whole folder hierarchy is publicly accessible. Some
faculties at my university do that.

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Re: [PHP] Replacing accented characters by non-accented characters

2008-05-14 Thread Bastien Koert
On Tue, May 13, 2008 at 8:39 AM, Per Jessen <[EMAIL PROTECTED]> wrote:

> Yannick Warnier wrote:
>
> > That would probably work out if it wasn't too dependent on the locales
> > to work. I'm developing an open-source product which could end up on a
> > server without the locales for French but be used by some French
> > people, which would make (as far as I can get out of one comment from
> > Richie in the PHP manual) the transliteration somewhat wrong.
>
> With the kind of rough conversion/transformation you're doing, is the
> locale really very important anyway?
>
>
> /Per Jessen, Zürich
>
>
Why should the server folder name matter? Make it a hash and store the user
provided name in a db. Then when presenting the data to the user just show
the user provided name as the folder name. This would also handle multiple
users trying to use the same folder name for their stuff.
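
A rough sketch of that idea, with a plain array standing in for the database
table. The helper name register_folder, the choice of SHA-256, and the record
layout are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: derives an opaque directory name from the user id and
// the user-provided folder name, and records the display name for later use.
function register_folder(array &$folders, int $userId, string $displayName): string
{
    // Including the user id lets two users pick the same folder name.
    $dirName = hash('sha256', $userId . ':' . $displayName);
    $folders[$dirName] = array(
        'user_id'      => $userId,
        'display_name' => $displayName, // shown to the user instead of the hash
    );
    return $dirName;
}
```

The directory on disk is then pure hexadecimal ASCII regardless of what the
user typed, and the original title is only ever read back from storage.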

-- 

Bastien

Cat, the other other white meat


Re: [PHP] Replacing accented characters by non-accented characters

2008-05-13 Thread Per Jessen
Yannick Warnier wrote:

> That would probably work out if it wasn't too dependent on the locales
> to work. I'm developing an open-source product which could end up on a
> server without the locales for French but be used by some French
> people, which would make (as far as I can get out of one comment from
> Richie in the PHP manual) the transliteration somewhat wrong.

With the kind of rough conversion/transformation you're doing, is the
locale really very important anyway?


/Per Jessen, Zürich





Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread tedd

Yannick:

Considering that we just had a flurry of pet-peeves on the list, I'll 
rant on one of mine.


At 1:25 PM -0500 5/12/08, Yannick Warnier wrote:
> I'm trying to give a universally-manageable directory name to an item
> using a free-text title. I want to avoid every type of accentuated
> character and everything outside of pure ASCII to make it the most
> portable possible.
> Generating a random hash is not acceptable as we want to be the most
> user-friendly possible.




As Rocky (the flying squirrel of Bullwinkle fame) once said when a 
gentleman in a black suit identified himself as "Military 
Intelligence" -- "That sounds like a contradiction in terms."


To make something as user-friendly as possible is to accommodate as 
many users as possible, including those whose native language is not 
English -- like 96 percent of the world.


You may want to call whatever you are doing a 
"universally-manageable directory", but it can't be if it rules out 
the majority of the universe (as we know it).


Why not embrace Unicode and not worry about it? I suggest you read 
"Building Scalable Web Sites" by Henderson -- specifically chapter 4, 
which deals with Unicode, Internationalization and Localization.


Looks to me like you're trying to fit a gross into a dozen -- that's 
a lossy process that's probably not going to do what you want.




Cheers,

tedd
--
---
http://sperling.com  http://ancientstones.com  http://earthstones.com




Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread Dotan Cohen
2008/5/12 Yannick Warnier <[EMAIL PROTECTED]>:
>> Why are you removing the accents? Why not store/process the data as
>> UTF-8, which supports all the accents in all the languages, and even
>> non-latin languages. You mention Arabic, which does not use accented
>> latin characters (Maybe you are thinking of Turkish, Uzbek or Tajik).
>> UTF-8 supports Arabic, Russian, Greek, Latin including modified
>> accented letters, and almost everything else save CJK.
>>
>> What is your end goal? Why are you removing the accents?
>
> Hi Dotan,
>
> I'm trying to give a universally-manageable directory name to an item
> using a free-text title. I want to avoid every type of accentuated
> character and everything outside of pure ASCII to make it the most
> portable possible.
> Generating a random hash is not acceptable as we want to be the most
> user-friendly possible.

I suppose that is a good reason. I actually tried to come up with a
use case that justifies the removal of latin accents, and couldn't.
I'll remember that. Tell me, what are you doing with Hebrew, Russian,
Arabic, and other non-latin scripts? If you want, I have some code
that roughly transliterates Hebrew <-> Latin on the
http://gibberish.co.il website.

> I'm talking about Arabic not to remove accentuated characters, but in
> case there would be a transliteration function that allows me to turn an
> Arabic character into something similar in terms of pronunciation but in
> ASCII.

If it needs to be transliterated back to Arabic you will have fun with
the letter forms! I can give you code that does it for Hebrew, but
Hebrew only has 5 final letters, and no explicit first- or middle-
forms.

> So the goal is to create a directory name that is both intuitive *and*
> portable for the user and the admin. The title is kept for the user, but
> there is a generic shortened code that is generated following the given
> title.
> We used to ask for a title in a webform, but realised our users liked it
> much better when we give them the possibility to generate the code
> themselves, but generating one ourselves by default.
> I just realised that the developer who did it seemed to make it using
> html codes directly, so we end up with codes like "EACUTETEACUTE" for an
> item called "été", while "ETE" would be far better.
>
> Yannick
>
>

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread Yannick Warnier
Le lundi 12 mai 2008 à 19:07 +0300, Dotan Cohen a écrit :
> 2008/5/12 Yannick Warnier <[EMAIL PROTECTED]>:
> > Hello,
> >
> > I've been trying to find something nice to transform an accentuated
> > string into a non-accentuated string. Obviously, I'm mostly playing
> > inside the European languages, but any method that could transform
> > arabic or asian characters to plain non-accentuated characters would be
> > perfect.
> >
> > I have found a number of solutions, ranging from str_replace() for every
> > known accentuated character to strtr() to a preg_replace() of a
> > conversion of the string to html characters then removing the "&" and
> > the "alteration" string (acute, grave, circ, ...).
> >
> > I must say the last one seems to work better because it's less affected
> > by charset changes, but it still seems awfully slow to me and I would
> > like to know if there is any function that exists that could do that for
> > me?
> >
> > Yannick
> >
> 
> Why are you removing the accents? Why not store/process the data as
> UTF-8, which supports all the accents in all the languages, and even
> non-latin languages. You mention Arabic, which does not use accented
> latin characters (Maybe you are thinking of Turkish, Uzbek or Tajik).
> UTF-8 supports Arabic, Russian, Greek, Latin including modified
> accented letters, and almost everything else save CJK.
> 
> What is your end goal? Why are you removing the accents?

Hi Dotan,

I'm trying to give a universally-manageable directory name to an item
using a free-text title. I want to avoid every type of accentuated
character and everything outside of pure ASCII to make it the most
portable possible.
Generating a random hash is not acceptable as we want to be the most
user-friendly possible.

I'm talking about Arabic not to remove accentuated characters, but in
case there would be a transliteration function that allows me to turn an
Arabic character into something similar in terms of pronunciation but in
ASCII.

So the goal is to create a directory name that is both intuitive *and*
portable for the user and the admin. The title is kept for the user, but
there is a generic shortened code that is generated following the given
title.
We used to ask for a title in a webform, but realised our users liked it
much better when we give them the possibility to generate the code
themselves, but generating one ourselves by default.
I just realised that the developer who did it seemed to make it using
html codes directly, so we end up with codes like "EACUTETEACUTE" for an
item called "été", while "ETE" would be far better.

Yannick
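
A sketch of how the entity-based approach described above could yield "ETE"
rather than "EACUTETEACUTE": convert to HTML entities, then keep only the base
letter of each accent entity instead of removing just the "&" and ";". It
assumes UTF-8 input; the helper name title_to_code and the list of entity
suffixes are hypothetical, and non-Latin scripts are simply dropped, not
transliterated.

```php
<?php
// Hypothetical helper: turns a free-text title into an uppercase ASCII code.
function title_to_code(string $title): string
{
    // "été" becomes "&eacute;t&eacute;".
    $entities = htmlentities($title, ENT_QUOTES, 'UTF-8');
    // Keep the base letter(s) and discard the alteration name:
    // "&eacute;" -> "e", "&ccedil;" -> "c", "&aelig;" -> "ae".
    $ascii = preg_replace(
        '/&([a-zA-Z]{1,2})(?:acute|grave|circ|uml|tilde|cedil|ring|slash|lig);/',
        '$1',
        $entities
    );
    // Decode whatever entities remain, then drop everything except letters and digits.
    $ascii = html_entity_decode($ascii, ENT_QUOTES, 'UTF-8');
    return strtoupper(preg_replace('/[^A-Za-z0-9]+/', '', $ascii));
}
```

title_to_code('été') returns 'ETE'. This still pays for the entity round trip
on every call, so it does not address the speed concern, only the output.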





Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread Dotan Cohen
2008/5/12 Yannick Warnier <[EMAIL PROTECTED]>:
> Hello,
>
> I've been trying to find something nice to transform an accentuated
> string into a non-accentuated string. Obviously, I'm mostly playing
> inside the European languages, but any method that could transform
> arabic or asian characters to plain non-accentuated characters would be
> perfect.
>
> I have found a number of solutions, ranging from str_replace() for every
> known accentuated character to strtr() to a preg_replace() of a
> conversion of the string to html characters then removing the "&" and
> the "alteration" string (acute, grave, circ, ...).
>
> I must say the last one seems to work better because it's less affected
> by charset changes, but it still seems awfully slow to me and I would
> like to know if there is any function that exists that could do that for
> me?
>
> Yannick
>

Why are you removing the accents? Why not store/process the data as
UTF-8, which supports all the accents in all the languages, and even
non-latin languages. You mention Arabic, which does not use accented
latin characters (Maybe you are thinking of Turkish, Uzbek or Tajik).
UTF-8 supports Arabic, Russian, Greek, Latin including modified
accented letters, and almost everything else save CJK.

What is your end goal? Why are you removing the accents?

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread Yannick Warnier
Thanks James,

That would probably work out if it wasn't too dependent on the locales
to work. I'm developing an open-source product which could end up on a
server without the locales for French but be used by some French people,
which would make (as far as I can get out of one comment from Richie in
the PHP manual) the transliteration somewhat wrong.

The dependency on iconv is also a minor problem to me as we are rather
using MB at the moment, but I guess I might find something similar in MB
anyway.

Thanks,

Yannick


Le lundi 12 mai 2008 à 16:28 +0100, James Dempster a écrit :
> oops wrong way round
> echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', 'français');
> 
> On Mon, May 12, 2008 at 4:27 PM, James Dempster <[EMAIL PROTECTED]> wrote:
> 
> > maybe try iconv (http://uk.php.net/manual/en/function.iconv.php)
> > e.g.
> >
> > echo iconv('ISO-8859-1', 'UTF-8//TRANSLIT', 'français');
> >
> > --
> > /James
> >
> >
> > On Mon, May 12, 2008 at 4:09 PM, Yannick Warnier <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Hello,
> > >
> > > I've been trying to find something nice to transform an accentuated
> > > string into a non-accentuated string. Obviously, I'm mostly playing
> > > inside the European languages, but any method that could transform
> > > arabic or asian characters to plain non-accentuated characters would be
> > > perfect.
> > >
> > > I have found a number of solutions, ranging from str_replace() for every
> > > known accentuated character to strtr() to a preg_replace() of a
> > > conversion of the string to html characters then removing the "&" and
> > > the "alteration" string (acute, grave, circ, ...).
> > >
> > > I must say the last one seems to work better because it's less affected
> > > by charset changes, but it still seems awfully slow to me and I would
> > > like to know if there is any function that exists that could do that for
> > > me?
> > >
> > > Yannick
> > >
> > >
> > >
> > >
> >





Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread James Dempster
oops wrong way round
echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', 'français');
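
Note that ISO-8859-1 still contains the accented letters, so the call above
keeps the "ç". If the goal is plain ASCII, the target charset can be
ASCII//TRANSLIT instead. A hedged sketch, assuming UTF-8 input: to_ascii is a
hypothetical helper, and the transliterated result depends on the iconv
implementation and on the current locale, so no particular output is
guaranteed. The fallback that drops non-ASCII bytes is also an assumption.

```php
<?php
// Hypothetical helper: best-effort conversion of UTF-8 text to ASCII.
function to_ascii(string $text): string
{
    $ascii = false;
    if (function_exists('iconv')) {
        // //TRANSLIT asks iconv to approximate characters missing from the
        // target charset. Output varies by iconv implementation and locale.
        $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    }
    if ($ascii === false) {
        // Fallback when iconv is unavailable or refuses: drop non-ASCII bytes.
        $ascii = preg_replace('/[^\x00-\x7F]/', '', $text);
    }
    return $ascii;
}
```

On some systems an untranslatable character comes back as "?", which is
worth checking for before using the result as a directory name.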

On Mon, May 12, 2008 at 4:27 PM, James Dempster <[EMAIL PROTECTED]> wrote:

> maybe try iconv (http://uk.php.net/manual/en/function.iconv.php)
> e.g.
>
> echo iconv('ISO-8859-1', 'UTF-8//TRANSLIT', 'français');
>
> --
> /James
>
>
> On Mon, May 12, 2008 at 4:09 PM, Yannick Warnier <[EMAIL PROTECTED]>
> wrote:
>
> > Hello,
> >
> > I've been trying to find something nice to transform an accentuated
> > string into a non-accentuated string. Obviously, I'm mostly playing
> > inside the European languages, but any method that could transform
> > arabic or asian characters to plain non-accentuated characters would be
> > perfect.
> >
> > I have found a number of solutions, ranging from str_replace() for every
> > known accentuated character to strtr() to a preg_replace() of a
> > conversion of the string to html characters then removing the "&" and
> > the "alteration" string (acute, grave, circ, ...).
> >
> > I must say the last one seems to work better because it's less affected
> > by charset changes, but it still seems awfully slow to me and I would
> > like to know if there is any function that exists that could do that for
> > me?
> >
> > Yannick
> >
> >
> >
> >
>


Re: [PHP] Replacing accented characters by non-accented characters

2008-05-12 Thread James Dempster
maybe try iconv (http://uk.php.net/manual/en/function.iconv.php)
e.g.

echo iconv('ISO-8859-1', 'UTF-8//TRANSLIT', 'français');

--
/James

On Mon, May 12, 2008 at 4:09 PM, Yannick Warnier <[EMAIL PROTECTED]>
wrote:

> Hello,
>
> I've been trying to find something nice to transform an accentuated
> string into a non-accentuated string. Obviously, I'm mostly playing
> inside the European languages, but any method that could transform
> arabic or asian characters to plain non-accentuated characters would be
> perfect.
>
> I have found a number of solutions, ranging from str_replace() for every
> known accentuated character to strtr() to a preg_replace() of a
> conversion of the string to html characters then removing the "&" and
> the "alteration" string (acute, grave, circ, ...).
>
> I must say the last one seems to work better because it's less affected
> by charset changes, but it still seems awfully slow to me and I would
> like to know if there is any function that exists that could do that for
> me?
>
> Yannick
>
>
>
>