On Wednesday 08 March 2006 01:36, Scott Cantor wrote:
> > <quote url=http://doc.trolltech.com/4.1/qstring.html>
> > The QString class provides a Unicode character string.
> > QString stores a string of 16-bit QChars, where each QChar
> > stores one Unicode 4.0 character.
>
> That sounds like UTF-16, but it's not using the terminology that would give
> me warm fuzzies. There are, I believe, other 16-bit encodings, but I could
> be mistaken about that. It's certainly worth a try, but Unicode is one of
> those things where you'd need to try the hard stuff before you'd hit the
> problems.
>
> > So, if everybody is telling the truth, can I not just do this?
> >
> > const XMLCh* QtoX(const QString& s) {
> >   return reinterpret_cast<const XMLCh*>(s.constData());
> > }
> >
> > const XMLCh* CtoX(const char* cs) { return QtoX(cs); }
>
> Absent memory management issues, it's possible, yeah.
>
> I'm not sure if that second function would work though. It seems like
> you're counting on some auto-conversion via QString to convert the ASCII,
> and then returning a cast of its internal buffer. That's a recipe for crash
> city, I would think (temp object created, reference passed, pointer to
> internals returned, object destroyed, pointer invalid).

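One way to sidestep the lifetime problem described in the quoted text is to
return an owning object rather than a raw pointer. The following is only a
minimal sketch under stated assumptions: std::u16string stands in for
QString, char16_t stands in for XMLCh (the real typedef depends on how
Xerces was built), and XStr and fromAscii are hypothetical names, not part
of Qt or Xerces.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Stand-in for XMLCh. The real typedef depends on the Xerces build; a
// 16-bit code unit is assumed here.
using XmlUnit = char16_t;

// Hypothetical owning wrapper: it holds its own copy of the code units, so
// the pointer from get() stays valid for as long as the XStr is alive.
class XStr {
public:
    explicit XStr(std::u16string s) : buf_(std::move(s)) {}
    const XmlUnit* get() const { return buf_.c_str(); }
    std::size_t length() const { return buf_.size(); }
private:
    std::u16string buf_;
};

// Hypothetical helper: widens a 7-bit ASCII C string to 16-bit code units.
// The result owns its buffer, so no pointer into a dead temporary escapes.
XStr fromAscii(const char* cs) {
    assert(cs != nullptr);
    std::u16string out;
    for (; *cs != '\0'; ++cs) {
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(*cs)));
    }
    return XStr(std::move(out));
}
```

A caller would keep the XStr in a named variable for the duration of the
call that consumes the pointer, for example XStr name = fromAscii("id");
and then pass name.get(). The cost is one copy per conversion.
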
I believe the second function is superfluous. Yes, I am depending on the
implicit conversion from const char* to QString. As for the pointer becoming
invalid, that depends on where the temporary lives. If QtoX() is called
directly in the argument list of a Xerces call, the temporary QString
survives until the end of that full expression, so if the first thing Xerces
does is copy the data, then I'm home free. Or, at least that is my
understanding of the C++ Standard. Inside CtoX(), though, the temporary is
gone by the time the function returns, so that version does hand back a
dangling pointer.

I'm not sure exactly where the overhead from transcoding comes in. It looks
like the transcoder examines each character individually to determine
whether a conversion is needed. That would be far more expensive than a
simple "Trust me. I know this is properly encoded".

IIRC, there /are/ different Unicode encodings, even among the 16-bit ones.
There is something called UCS-4, and also something called UCS-2 (I
believe). I do not know the difference between these and their related
UTF-32 and UTF-16.

Another potential source of overhead is memory allocation. Qt uses an
implicitly shared (copy-on-write) model for QString, but I don't know what
that will buy me.

Steven
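
For reference, the UCS-2 versus UTF-16 question raised above comes down to
surrogate pairs: UCS-2 is a fixed-width 16-bit form that covers only code
points up to U+FFFF, while UTF-16 encodes anything above that as a pair of
16-bit code units. A rough sketch of how the difference shows up, using
std::u16string as a stand-in for a QString buffer (an assumption; the
function names are hypothetical):

```cpp
#include <cstddef>
#include <string>

// High (leading) surrogates occupy U+D800..U+DBFF.
inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// Low (trailing) surrogates occupy U+DC00..U+DFFF.
inline bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Counts Unicode code points in a UTF-16 string. A well-formed surrogate
// pair counts as one; an unpaired surrogate is counted as one unit.
std::size_t countCodePoints(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isHighSurrogate(s[i]) && i + 1 < s.size() &&
            isLowSurrogate(s[i + 1])) {
            ++i;  // skip the trailing half of the pair
        }
        ++n;
    }
    return n;
}

// True if the string contains no surrogates at all, i.e. it would read the
// same whether interpreted as UCS-2 or as UTF-16.
bool fitsInUcs2(const std::u16string& s) {
    for (char16_t u : s) {
        if (isHighSurrogate(u) || isLowSurrogate(u)) return false;
    }
    return true;
}
```

For text that stays within the Basic Multilingual Plane the two
interpretations agree, which is presumably why the cast approach would
appear to work until "the hard stuff" is tried.
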
