hi people:

i am designing an application written in php which takes
xml files as input and parses them via pear::xml_tree (based
upon pear::xml_parser, which is itself based upon sax). the
values taken from the xml are later passed around among a
variety of php objects and finally rendered via a template
engine (e.g. smarty).
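
to make the setup concrete, here is a rough sketch of that pipeline. it is only an illustration under a few assumptions: it uses php's built-in xml functions (xml_parse_into_struct) instead of pear::xml_tree, the <item> element and the sample strings are made up, and a plain echo stands in for the template engine.

```php
<?php
// sketch only, not the real application: php's built-in xml functions
// stand in for pear::xml_tree, and the <items>/<item> layout is hypothetical.
// this source file itself is assumed to be saved as utf-8.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<items><item>Grüße</item><item>日本語</item></items>';

// ask the parser to read utf-8 input and to hand back utf-8 strings
$parser = xml_parser_create('UTF-8');
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
xml_parse_into_struct($parser, $xml, $values);

// collect the text of every <item>; these are the values that would
// later be passed around among the php objects
$items = array();
foreach ($values as $node) {
    if ($node['tag'] === 'item' && $node['type'] === 'complete') {
        $items[] = $node['value'];
    }
}

// stand-in for the template step (smarty or similar)
foreach ($items as $text) {
    echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8'), "\n";
}
```

the open question for me is what happens to such values at each of these steps once the input is not plain latin text.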

my basic problem is that i am not familiar with how
the different encoding schemes are 'compatible' with php
and its functions.

i read that php internally uses the (variable-length)
utf-8 encoding. on the other hand, applications like
xmlspy and recent windows apps usually use the
(double-byte) utf-16 format.

my question as a developer is now which format to use for
my xml files (i want to store data in latin [european], japanese,
chinese/taiwanese and russian). i would love to deal with just
one encoding that covers all languages instead of needing a
variety of different encoding schemes. i would also need
some more info on datatypes. for example: is a taiwanese word
also treated as a string in php, and how do we deal with non-arabic
numeral systems?


furthermore, i would love to know if there are any
problems with regular expressions, because i guess the alphabet
of php's ereg engine is mostly ansi based, isn't it? or maybe
i am wrong and ereg recognizes the alphabet in use automatically?

basically, my research via google and nec's research index
didn't turn up any good papers on the subject which
don't lose themselves in the vastness of details on different
encodings, byte orders and so on.

are there any good articles for newbies to look at?

any pointers to useful and applicable knowledge sources
would be really appreciated.

yours,
matthias


__ http://www.parkstudios.net

--
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


