Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2008-03-11 Thread Bruce Momjian
Added to TODO: * Change memory allocation for multi-byte functions so memory is allocated inside conversion functions. Currently we preallocate memory based on worst-case usage. --- Tom Lane wrote: Tatsuo Ishii

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-18 Thread Tatsuo Ishii
Sorry for the delay. On Tue, May 29, 2007 20:51, Tatsuo Ishii wrote: Thinking more, it struck me that users can define an arbitrary growth rate by using CREATE CONVERSION. So it seems we need functionality to define the growth rate anyway. Would it make sense to define just the longest

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-18 Thread Tatsuo Ishii
The conclusion of the discussion appears to be that we could safely reduce MAX_CONVERSION_GROWTH from 4 to 3 with all existing built-in conversions. However, since user-defined conversions could have an arbitrary growth rate, it would probably be better to leave it as it is now. For 8.4, maybe we could
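
To make the trade-off concrete, the worst-case preallocation that MAX_CONVERSION_GROWTH controls can be sketched as below. This is only an illustration in Python, not the code in mbutils.c: the function name `worst_case_buffer_size` and the extra byte reserved for a terminator are assumptions made for the example.

```python
# Sketch only: models a caller that preallocates the output buffer
# from the input length and a fixed worst-case growth factor.

MAX_CONVERSION_GROWTH = 4  # the value discussed in this thread


def worst_case_buffer_size(src_len: int, growth: int = MAX_CONVERSION_GROWTH) -> int:
    """Return the number of bytes to preallocate for a conversion result.

    The "+ 1" reserves room for a terminating NUL byte (an assumption
    made for this sketch).
    """
    if src_len < 0:
        raise ValueError("src_len must be non-negative")
    return src_len * growth + 1


if __name__ == "__main__":
    # Compare the preallocation for a 1 MB input at growth 4 and growth 3.
    for growth in (4, 3):
        print(growth, worst_case_buffer_size(1_000_000, growth))
```

Lowering the factor from 4 to 3 shrinks every preallocation by roughly a quarter, but it is only safe if no conversion, built-in or user-defined, ever exceeds the smaller factor.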

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-18 Thread Bruce Momjian
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Tatsuo Ishii wrote: The conclusion of the discussion appears to be that we could safely reduce MAX_CONVERSION_GROWTH from 4 to

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-16 Thread Bruce Momjian
Where are we on this? --- Tom Lane wrote: I just rearranged the code in mbutils.c a little bit to make it more robust if conversion of an over-length string is attempted, and noted this comment: /* * When converting

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Tatsuo Ishii
On Mon, May 28, 2007 at 10:23:42PM -0400, Tom Lane wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: I'm afraid we have to make it larger, rather than smaller, for 8.3. For example, 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3-byte UTF_8 characters (0x00e3818b and 0x00e3829a). See

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: Thinking more, it struck me that users can define an arbitrary growth rate by using CREATE CONVERSION. So it seems we need functionality to define the growth rate anyway. Seems to me that would be an argument for moving the palloc inside the conversion
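
The idea of allocating inside the conversion routine, rather than having the caller guess a worst case, can be sketched roughly as follows. This is a hypothetical Python illustration, not PostgreSQL code: `latin1_to_utf8` is a name invented for the example, and a growable `bytearray` stands in for memory the conversion function would allocate itself.

```python
# Sketch only: the conversion routine owns its output buffer and
# grows it on demand, so the caller needs no worst-case growth factor.


def latin1_to_utf8(src: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8, sizing the output as it goes."""
    out = bytearray()  # grows as needed inside the conversion
    for b in src:
        if b < 0x80:
            # ASCII range: one byte in, one byte out.
            out.append(b)
        else:
            # U+0080..U+00FF: one byte in, two bytes out.
            out.append(0xC0 | (b >> 6))
            out.append(0x80 | (b & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    src = b"caf\xe9"
    dst = latin1_to_utf8(src)
    # Actual output size versus a caller-side 4x preallocation.
    print(len(dst), len(src) * 4 + 1)
```

With this arrangement the output is exactly as large as the conversion needs, and a user-defined conversion with an unusual growth rate cannot overrun a buffer sized by someone else's assumption.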

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Michael Fuhr
On Tue, May 29, 2007 at 10:00:06AM -0400, Tom Lane wrote: In practice though, I find it hard to imagine a pair of encodings for which the growth rate is more than 3x. You'd need something that translates a single-byte character into 4 or more bytes (pretty unlikely, especially considering we

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Jeroen T. Vermeulen
On Tue, May 29, 2007 20:51, Tatsuo Ishii wrote: Thinking more, it struck me that users can define an arbitrary growth rate by using CREATE CONVERSION. So it seems we need functionality to define the growth rate anyway. Would it make sense to define just the longest and shortest character

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tatsuo Ishii
I just rearranged the code in mbutils.c a little bit to make it more robust if conversion of an over-length string is attempted, and noted this comment: /* * When converting strings between different encodings, we assume that space * for converted result is 4-to-1 growth in the worst

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes: I'm afraid we have to make it larger, rather than smaller, for 8.3. For example, 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3-byte UTF_8 characters (0x00e3818b and 0x00e3829a). See util/mb/Unicode/shift_jis_2004_to_utf8_combined.map for more details. So
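
The arithmetic behind this example can be checked with a short Python sketch. It uses only the byte values quoted above and does not call any SHIFT_JIS_2004 codec; the helper name `growth_rate` is an assumption made for the illustration.

```python
# Sketch only: compute the byte growth for the 0x82f5 example
# using the byte values quoted in the message.


def growth_rate(src: bytes, dst: bytes) -> float:
    """Return output bytes per input byte for one conversion."""
    if not src:
        raise ValueError("source must not be empty")
    return len(dst) / len(src)


# Two-byte SHIFT_JIS_2004 input quoted in the thread.
src = bytes.fromhex("82f5")

# The pair of 3-byte UTF-8 sequences it maps to (leading 0x00 dropped).
dst = bytes.fromhex("e3818b") + bytes.fromhex("e3829a")

# Decoding shows these are two separate code points.
decoded = dst.decode("utf-8")

if __name__ == "__main__":
    print([hex(ord(ch)) for ch in decoded])
    print(len(src), len(dst), growth_rate(src, dst))
```

Two input bytes become six output bytes, so this particular mapping grows by a factor of 3 even though a single input character turns into two output characters.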

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tatsuo Ishii
Can we add a column to pg_conversion which represents the growth rate? This would reduce the rate for most encodings to much less than 6. We need to do something, but the pg_conversion catalog seems a bad place to put the info --- don't we have places that need to be able to do

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Michael Fuhr
On Mon, May 28, 2007 at 10:23:42PM -0400, Tom Lane wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: I'm afraid we have to make it larger, rather than smaller, for 8.3. For example, 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3-byte UTF_8 characters (0x00e3818b and 0x00e3829a). See

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tatsuo Ishii
On Mon, May 28, 2007 at 10:23:42PM -0400, Tom Lane wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: I'm afraid we have to make it larger, rather than smaller, for 8.3. For example, 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3-byte UTF_8 characters (0x00e3818b and 0x00e3829a). See