Dear Tatsuo,

> > > 1) a character is not always represented on a terminal proportional to
> > > the storage size. For example a kanji character in UTF-8 encoding
> > > has a storage size of 3 bytes while it occupies only twice the space
> > > of an ASCII character on a terminal. The same can be said of LATIN
> > > 2,3 etc. in UTF-8 perhaps.
> >
> > I thought I dealt with that in the code by calling PQmblen for every char.
> > Am I wrong ?
>
> PQmblen returns the storage size, which is not necessarily the same as the
> character width represented in a terminal. For example for a kanji
> character in UTF-8 PQmblen returns 3, but it occupies 2 x ASCII
> character space, not x 3. Isn't that a problem for you?

If I read you correctly, you mean that 1 character may take 3 bytes of
storage in the string, but it is not guaranteed to be 1 character wide from
the terminal's perspective... Argh, that's definitely an issue :-(

I assumed that one character, whatever the encoding, would be 1 character on
the display. If that is not the case, I think I can put/compute this
information in the translation structures that are used by PQmblen, and
implement a PQmbtermlen function...

Maybe you could point me to some source of information about the display
lengths of characters depending on the encoding?

> > What I mean by "ASCII compatible" is that spaces, new lines, carriage
> > returns, tabs and NULL (C string termination) are one byte characters.
> > This assumption seemed pretty safe to me.
>
> I think you can do it safely using PQmblen.

Ok, what you describe is basically what I've done with the qidx computation,
as suggested by Tom Lane; later I check that the encoded length is one
to find my special characters.

Thanks for your reply,

--
Fabien Coelho - [EMAIL PROTECTED]
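
To make the storage-length versus display-width distinction concrete, here is a minimal sketch in C. It is not the libpq implementation, and it is UTF-8 only. The function names (utf8_storage_len, utf8_display_width, string_display_width) are hypothetical, and the table of double-width ranges is a deliberately incomplete assumption: it covers common CJK blocks and ignores combining and control characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: storage size in bytes of the UTF-8 character
 * starting at s, taken from the lead byte only (no validation). */
int utf8_storage_len(const unsigned char *s)
{
    assert(s != NULL);
    if (*s < 0x80) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/* Hypothetical helper: terminal columns for one code point.
 * ASSUMPTION: simplified, incomplete list of double-width ranges. */
int utf8_display_width(unsigned long cp)
{
    if ((cp >= 0x1100 && cp <= 0x115F) ||   /* Hangul Jamo */
        (cp >= 0x2E80 && cp <= 0xA4CF) ||   /* CJK, kana, ideographs */
        (cp >= 0xAC00 && cp <= 0xD7A3) ||   /* Hangul syllables */
        (cp >= 0xF900 && cp <= 0xFAFF) ||   /* CJK compatibility ideographs */
        (cp >= 0xFF00 && cp <= 0xFF60) ||   /* fullwidth forms */
        (cp >= 0xFFE0 && cp <= 0xFFE6))
        return 2;
    return 1;
}

/* Sum of display columns over a NUL-terminated UTF-8 string. */
int string_display_width(const char *str)
{
    static const unsigned char lead_mask[] = {0, 0x7F, 0x1F, 0x0F, 0x07};
    const unsigned char *s = (const unsigned char *) str;
    int width = 0;

    assert(str != NULL);
    while (*s)
    {
        int len = utf8_storage_len(s);
        unsigned long cp = s[0] & lead_mask[len];
        int i;

        for (i = 1; i < len; i++)
        {
            if (s[i] == 0)      /* truncated sequence: count one column */
                return width + 1;
            cp = (cp << 6) | (s[i] & 0x3F);
        }
        width += utf8_display_width(cp);
        s += len;               /* advance by storage size, not by width */
    }
    return width;
}
```

With this sketch, a string of three kanji encoded as "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E" has strlen 9 but a display width of 6, while "abc" has 3 for both. Accented Latin letters such as "\xC3\xA9" take 2 bytes but 1 column, so padding computed from byte counts would misalign in both cases.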