A classic general overview (on the topic of "what the heck ARE character sets???"):
http://www.joelonsoftware.com/articles/Unicode.html

On Wed, Dec 16, 2009 at 11:02 AM, Ken Irwin <kir...@wittenberg.edu> wrote:
> Hi all,
>
> I'm looking for a good source to help me understand character sets and how to
> use them. I pretty much know nothing about this - the whole world of Unicode,
> ASCII, octal, UTF-8, etc. is baffling to me.
>
> My immediate issue is that I think I need to integrate data from a variety of
> character sets into one MySQL table - I expect I need some way to convert
> from one to another, but I don't really even know how to tell which data are
> in which format.
>
> Our homegrown journal list (akin to SerialsSolutions) includes data ingested
> from publishers, vendors, the library catalog (III), etc. When I look at the
> data in emacs, some of it renders like this:
> Revista de Oncolog\303\255a [slashes-and-digits instead of
> diacritics]
> And other data looks more like:
> Revista de Música Latinoamericana [weird characters instead of
> diacritics]
>
> My MySQL table is currently set up with the collation set to: utf8-bin, and
> the titles from the second category (weird characters display in emacs)
> render properly when the database data is output to a web browser. The
> data from the former example (\###) renders as an "I don't know what
> character this is" placeholder in Firefox and IE.
>
> So, can someone please point me toward any or all of the following?
>
> · A good primer for understanding all of this stuff
>
> · A method for converting all of my data to the same character set so
> it plays nicely in the database
>
> · The names of which character-sets I might be working with here
>
> Many thanks!
>
> Ken
>
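
On the "which character set is this, and how do I convert it" questions in the quoted message, here is a rough Python sketch rather than a tested recipe. The one solid fact behind it: \303\255 is how emacs shows the raw bytes 0xC3 0xAD in octal, and those two bytes are the UTF-8 encoding of "í". Everything else is an assumption on my part: the function names unescape_octal and to_unicode are made up for illustration, and the Latin-1 fallback is only a guess at what the non-UTF-8 records might be (your vendor or catalog exports may use something else entirely).

```python
import re


def unescape_octal(text: str) -> bytes:
    """Turn emacs-style octal escapes (e.g. \\303\\255) back into raw bytes.

    Assumes everything outside the escapes is plain ASCII.
    """
    out = bytearray()
    pos = 0
    # Three octal digits, first digit 0-3, so the value fits in one byte.
    for match in re.finditer(r"\\([0-3][0-7]{2})", text):
        out += text[pos:match.start()].encode("ascii")
        out.append(int(match.group(1), 8))
        pos = match.end()
    out += text[pos:].encode("ascii")
    return bytes(out)


def to_unicode(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode bytes as UTF-8 if they are valid UTF-8, else use the fallback.

    The fallback encoding is a guess (assumption), not something the data
    itself can confirm.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


if __name__ == "__main__":
    raw = unescape_octal(r"Revista de Oncolog\303\255a")
    print(raw)              # the raw bytes, including 0xC3 0xAD
    print(to_unicode(raw))  # decoded text, ready to re-encode as UTF-8
```

The idea is to try strict UTF-8 first, because random Latin-1 text with accented characters almost never happens to be valid UTF-8, and only then fall back. Once every title is a proper Unicode string, writing it all back out as UTF-8 gives the database one consistent encoding to deal with.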