Dear all, in a current project I have to deal with many strings, some of which are ISO-8859-1 and some of which are various flavours of Unicode. I've taken the good advice of the list and I store all of these strings as UTF-8 for internal use, but now I have another problem.

The spec for what I'm doing (an ID3 tagging library) requires that some of the strings written out into a tag must be ISO-8859-1, while others may be either ISO-8859-1 or UTF-16...so my question is:

Given any UTF-8 string, can it be determined whether the string can be properly represented as ISO-8859-1 (single-byte chars), or whether UTF-16 (double-byte chars) is needed?
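For what it's worth, the test can be sketched simply: ISO-8859-1 maps bytes 0x00-0xFF directly onto code points U+0000-U+00FF, so a UTF-8 string fits in ISO-8859-1 exactly when every decoded code point is at or below U+00FF. A minimal sketch in Python (the function name `fits_latin1` is my own; the same per-code-point check could be done in Revolution or any other language):

```python
# Sketch: decide whether UTF-8 encoded text can be stored losslessly
# as ISO-8859-1 (Latin-1). It can if and only if every code point
# is <= U+00FF, since Latin-1 covers exactly that range.

def fits_latin1(utf8_bytes: bytes) -> bool:
    text = utf8_bytes.decode("utf-8")          # raises on invalid UTF-8
    return all(ord(ch) <= 0xFF for ch in text)  # True -> Latin-1 suffices

# Usage: pick ISO-8859-1 when possible, fall back to UTF-16 otherwise.
print(fits_latin1("café".encode("utf-8")))    # True  -> write ISO-8859-1
print(fits_latin1("日本語".encode("utf-8")))   # False -> write UTF-16
```

So the decision can be made per string, choosing the smaller encoding whenever the content allows it.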

I could simply save all strings that the spec allows as UTF-16, but this is likely to produce considerably larger tags, and would be rather against the spirit of the spec, which explicitly aims to be 'byte-efficient'.

Any thoughts on this gratefully received.

Best,

Mark
_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
