Hi Eugenio,

> I need to distribute an application that can potentially be used with
> many different DBMSs (such as MySQL, PostgreSQL, SQLite, Microsoft SQL
> Server). The databases can use ANY charset.
>
> I would like to always output UTF-8 text when possible, and my
> questions are about the current best practices for handling this kind of
> application with PHP.
>
> 1) As far as I know, PHP still doesn't natively support UTF-8, so to
> avoid problems with string functions I still have to use the mbstring
> functions, am I right? What does PHP 5.4 change about that?

AFAIK, correct, and there haven't been many significant changes on this front recently.

> 2) How to handle the fact that the data I receive from the database
> can be stored using any possible charset? Do I need iconv functions
> and convert everything to UTF-8? And then convert it back to the
> original charset when I have to write to the DB?

I'd be interested to hear others' thoughts, but the general consensus these days is "convert everything to UTF-8". Is there an application requirement that would force you to convert data to a different charset at different times?

In general:

1. Raw data (any charset/encoding)
2. Detect and convert to UTF-8, clean up, etc.
3. Store in database/etc.
4. Read/display in UTF-8

This should support the vast majority of written human languages, though I believe there are some exceptions.

H

_______________________________________________
New York PHP User Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk

http://www.nyphp.org/show-participation
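
For what it's worth, step 2 of the list above ("detect and convert to UTF-8") could be sketched roughly as follows with the mbstring functions. This is only a minimal sketch under some assumptions: the helper name to_utf8() is hypothetical, the fallback candidate list (ISO-8859-1 only) is a placeholder you would replace with the charsets you actually expect, and when the database connection reports its charset you should pass that in rather than rely on detection, which is a guess at best.

```php
<?php
// Hypothetical helper, not part of PHP or of the original message:
// normalise a string of unknown or known encoding to UTF-8.
function to_utf8($raw, $sourceCharset = null, array $candidates = ['ISO-8859-1'])
{
    // Caller knows the charset (e.g. from the DB connection): trust it.
    if ($sourceCharset !== null) {
        return mb_convert_encoding($raw, 'UTF-8', $sourceCharset);
    }

    // Already valid UTF-8: leave the bytes untouched.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Otherwise guess among the candidates (strict mode). This is a
    // heuristic; many single-byte charsets are indistinguishable.
    $from = mb_detect_encoding($raw, $candidates, true);
    if ($from === false) {
        throw new InvalidArgumentException('Could not detect source encoding');
    }

    return mb_convert_encoding($raw, 'UTF-8', $from);
}

// "cafe" with an acute accent on the e, encoded as ISO-8859-1 (4 bytes).
$latin1 = "caf\xE9";
$utf8   = to_utf8($latin1);

// Byte-oriented strlen() and character-oriented mb_strlen() now disagree,
// which is why the mbstring functions are still needed for UTF-8 text.
$bytes = strlen($utf8);
$chars = mb_strlen($utf8, 'UTF-8');
```

Once everything inside the application is UTF-8, the only other conversion point is writing back to a non-UTF-8 database, where the same mb_convert_encoding() call can be used with the arguments reversed.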