Re: [HACKERS] pg_do_encoding_conversion glitch

2008-11-10 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 I have a question about the result contract of pg_do_encoding_conversion().

It's poorly designed :-(

As near as I can tell, all uses of the function either pass a
null-terminated string or special-case the result == src case in such a
way that it doesn't matter.  However it's certainly not obvious that you
have to do that.

The calls in contrib/sslinfo might be broken --- not sure how much
that module has been tested.

Do you have a proposal for a different API, or do you just want to
improve the comment for the function?  Bear in mind that a lot of the
callers do know the string length, and so we shouldn't impose an
unnecessary strlen() operation on those cases.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_do_encoding_conversion glitch

2008-11-10 Thread ITAGAKI Takahiro
I have a question about the result contract of pg_do_encoding_conversion().
It can receive non null-terminated string because its arguments are
a char array and a byte length.
And it only returns a string, so the string should be null-terminated.

However, if conversions are not required, the function returns
the input string itself even though it might be not null-terminated.

I checked usages of pg_do_encoding_conversion() and xml_parse()
could cause troubles. Is it a bug? needed to be fixed?


 [utils/mb/mbutils.c]
unsigned char *
pg_do_encoding_conversion(unsigned char *src, int len,
  int src_encoding, int dest_encoding)
{
...
if (src_encoding == dest_encoding)
return src;


 [utils/adt/xml.c]
static xmlDocPtr
xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace,
  xmlChar * encoding)
{
...
len = VARSIZE(data) - VARHDRSZ; /* will be useful later */
string = xml_text2xmlChar(data);

utf8string = pg_do_encoding_conversion(string,
   len,
   encoding ?
   xmlChar_to_encoding(encoding) :
   [It could be UTF8 to UTF8] --  GetDatabaseEncoding(),
   PG_UTF8);


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_do_encoding_conversion glitch

2008-11-10 Thread ITAGAKI Takahiro

Tom Lane [EMAIL PROTECTED] wrote:

 Do you have a proposal for a different API, or do you just want to
 improve the comment for the function?  Bear in mind that a lot of the
 callers do know the string length, and so we shouldn't impose an
 unnecessary strlen() operation on those cases.

We already have the following comment, so I think a new comment is
not needed. 
| In the case of no conversion, src is returned.

Since Assert() is not available in the case, developers should use
the function carefully after all. My patch might be fixed, too...
| Solve a problem of LC_TIME of windows
| http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_do_encoding_conversion glitch

2008-11-10 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] wrote:
 Do you have a proposal for a different API, or do you just want to
 improve the comment for the function?  Bear in mind that a lot of the
 callers do know the string length, and so we shouldn't impose an
 unnecessary strlen() operation on those cases.

 We already have the following comment, so I think a new comment is
 not needed. 
 | In the case of no conversion, src is returned.

I added a few more lines just to help people out --- in digging around
earlier today, there seemed to be a few places that hadn't been very
careful about this.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers