Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2008-03-11 Thread Bruce Momjian
Added to TODO: * Change memory allocation for multi-byte functions so memory is allocated inside conversion functions Currently we preallocate memory based on worst-case usage. --- Tom Lane wrote: > Tatsuo Ishii <[EMA

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-18 Thread Bruce Momjian
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Tatsuo Ishii wrote: > The conclusion of the discussion appears to be that we could reduce > MAX_CONVERSION_GROWTH from 4 to

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-18 Thread Tatsuo Ishii
The conclusion of the discussion appears to be that we could reduce MAX_CONVERSION_GROWTH from 4 to 3 safely with all existing built-in conversions. However, since user-defined conversions could have an arbitrary growth rate, it would probably be better to leave it as it is now. For 8.4, maybe we could change

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-18 Thread Tatsuo Ishii
Sorry for the delay. > On Tue, May 29, 2007 20:51, Tatsuo Ishii wrote: > > > Thinking more, it struck me that users can define an arbitrary growth > > rate by using CREATE CONVERSION. So it seems we need functionality to > > define the growth rate anyway. > > Would it make sense to define just the

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-07-16 Thread Bruce Momjian
Where are we on this? --- Tom Lane wrote: > I just rearranged the code in mbutils.c a little bit to make it more > robust if conversion of an over-length string is attempted, and noted > this comment: > > /* > * When conve

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Jeroen T. Vermeulen
On Tue, May 29, 2007 20:51, Tatsuo Ishii wrote: > Thinking more, it struck me that users can define an arbitrary growth > rate by using CREATE CONVERSION. So it seems we need functionality to > define the growth rate anyway. Would it make sense to define just the longest and shortest character

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Michael Fuhr
On Tue, May 29, 2007 at 10:00:06AM -0400, Tom Lane wrote: > In practice though, I find it hard to imagine a pair of encodings for > which the growth rate is more than 3x. You'd need something that > translates a single-byte character into 4 or more bytes (pretty > unlikely, especially considering

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Tom Lane
Tatsuo Ishii <[EMAIL PROTECTED]> writes: > Thinking more, it struck me that users can define an arbitrary growth > rate by using CREATE CONVERSION. So it seems we need functionality to > define the growth rate anyway. Seems to me that would be an argument for moving the palloc inside the convers
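The idea of allocating inside the conversion routine, rather than preallocating for the worst case in the caller, can be illustrated with a small standalone sketch. This is not PostgreSQL code: it uses malloc instead of palloc, the function name latin1_to_utf8_alloc is hypothetical, and ISO-8859-1 to UTF-8 is chosen only because its growth (at most 2 bytes per input byte) is easy to verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical example: convert ISO-8859-1 to UTF-8, with the conversion
 * function sizing and allocating its own output buffer.
 *
 * Returns a malloc'd, NUL-terminated string, or NULL on failure.
 * *out_len receives the output length, excluding the trailing NUL.
 */
char *
latin1_to_utf8_alloc(const unsigned char *src, size_t src_len, size_t *out_len)
{
    size_t need = 0;
    size_t i;
    size_t o = 0;
    char  *dst;

    assert(src != NULL && out_len != NULL);

    /* Refuse inputs whose worst-case output size would overflow size_t. */
    if (src_len > (SIZE_MAX - 1) / 2)
        return NULL;

    /* First pass: count exactly how many output bytes are needed. */
    for (i = 0; i < src_len; i++)
        need += (src[i] < 0x80) ? 1 : 2;

    dst = malloc(need + 1);
    if (dst == NULL)
        return NULL;

    /* Second pass: emit one byte for ASCII, two bytes for 0x80..0xFF. */
    for (i = 0; i < src_len; i++)
    {
        if (src[i] < 0x80)
            dst[o++] = (char) src[i];
        else
        {
            dst[o++] = (char) (0xC0 | (src[i] >> 6));
            dst[o++] = (char) (0x80 | (src[i] & 0x3F));
        }
    }
    dst[o] = '\0';
    *out_len = o;
    return dst;
}
```

With this shape the caller needs no global growth constant, since each conversion knows its own output size. The cost is a second pass over the input, or a realloc strategy in conversions where counting first is impractical.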

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-29 Thread Tatsuo Ishii
> > On Mon, May 28, 2007 at 10:23:42PM -0400, Tom Lane wrote: > > > Tatsuo Ishii <[EMAIL PROTECTED]> writes: > > > > I'm afraid we have to make it larger, rather than smaller, for 8.3. For > > > > example 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3 > > > > bytes UTF_8 (0x00e3818b and 0x

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tatsuo Ishii
> On Mon, May 28, 2007 at 10:23:42PM -0400, Tom Lane wrote: > > Tatsuo Ishii <[EMAIL PROTECTED]> writes: > > > I'm afraid we have to make it larger, rather than smaller, for 8.3. For > > > example 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3 > > > bytes UTF_8 (0x00e3818b and 0x00e3829a).

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Michael Fuhr
On Mon, May 28, 2007 at 10:23:42PM -0400, Tom Lane wrote: > Tatsuo Ishii <[EMAIL PROTECTED]> writes: > > I'm afraid we have to make it larger, rather than smaller, for 8.3. For > > example 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3 > > bytes UTF_8 (0x00e3818b and 0x00e3829a). See > > u

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tatsuo Ishii
> > Can we add a column to pg_conversion which represents the "growth > > rate"? This would reduce the rate for most encodings to much less than > > 6. > > We need to do something, but the pg_conversion catalog seems a bad place > to put the info --- don't we have places that need to be able to do

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tom Lane
Tatsuo Ishii <[EMAIL PROTECTED]> writes: > I'm afraid we have to make it larger, rather than smaller, for 8.3. For > example 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of 3 > bytes UTF_8 (0x00e3818b and 0x00e3829a). See > util/mb/Unicode/shift_jis_2004_to_utf8_combined.map for more details.
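The arithmetic behind that example can be checked with a short sketch. It assumes the two quoted values are read as the 3-byte UTF-8 sequences E3 81 8B and E3 82 9A; the helper names decode_utf8_3 and growth_ratio are made up for illustration and are not part of PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a single 3-byte UTF-8 sequence into its Unicode code point. */
uint32_t
decode_utf8_3(const unsigned char *p)
{
    /* Lead byte must be 1110xxxx, continuation bytes 10xxxxxx. */
    assert((p[0] & 0xF0) == 0xE0);
    assert((p[1] & 0xC0) == 0x80 && (p[2] & 0xC0) == 0x80);

    return ((uint32_t) (p[0] & 0x0F) << 12) |
           ((uint32_t) (p[1] & 0x3F) << 6) |
           (uint32_t) (p[2] & 0x3F);
}

/* Growth ratio of a conversion, rounded up to a whole number. */
size_t
growth_ratio(size_t in_bytes, size_t out_bytes)
{
    assert(in_bytes > 0);
    return (out_bytes + in_bytes - 1) / in_bytes;
}
```

The two sequences decode to U+304B and U+309A, a base character followed by a combining mark. The 2-byte SHIFT_JIS_2004 input therefore yields 6 bytes of UTF-8, so growth_ratio(2, 6) gives 3 for this particular character.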

Re: [HACKERS] What is the maximum encoding-conversion growth rate, anyway?

2007-05-28 Thread Tatsuo Ishii
> I just rearranged the code in mbutils.c a little bit to make it more > robust if conversion of an over-length string is attempted, and noted > this comment: > > /* > * When converting strings between different encodings, we assume that space > * for converted result is 4-to-1 growth in the wor
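As a rough illustration of worst-case preallocation that stays robust for over-length input, here is a standalone sketch. It is not the code in mbutils.c: GROWTH_FACTOR merely mirrors the 4-to-1 figure in the quoted comment, malloc stands in for palloc, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed worst-case growth, mirroring the 4-to-1 figure quoted above. */
#define GROWTH_FACTOR 4

/*
 * Bytes to allocate for converting src_len input bytes, including one
 * byte for a trailing NUL. Returns 0 if the size would overflow size_t.
 */
size_t
worst_case_size(size_t src_len)
{
    if (src_len > (SIZE_MAX - 1) / GROWTH_FACTOR)
        return 0;
    return src_len * GROWTH_FACTOR + 1;
}

/*
 * Allocate a worst-case output buffer for the given input, or return
 * NULL if the input is too long or memory is exhausted.
 */
char *
alloc_conversion_buffer(const char *src, size_t src_len)
{
    size_t need;

    assert(src != NULL);

    need = worst_case_size(src_len);
    if (need == 0)
        return NULL;
    return malloc(need);
}
```

The check happens before the multiplication, so an over-length string is rejected instead of silently wrapping around to a small allocation.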