On Monday, 30 March 2015 18:45:41, Jim Nasby wrote:
> On 3/30/15 5:46 AM, Ronan Dunklau wrote:
> > Hello hackers,
> > 
> > I've tried my luck on pgsql-bugs before, with no success, so I'm
> > reporting this problem here.
> > 
> > The documentation mentions the following limits for sizes:
> > 
> > Maximum Field Size      1 GB
> > Maximum Row Size        1.6 TB
> > 
> > However, it seems like rows bigger than 1GB can't be COPYed out:
> > 
> > ro=# create table test_text (c1 text, c2 text);
> > CREATE TABLE
> > ro=# insert into test_text (c1) VALUES (repeat('a', 536870912));
> > INSERT 0 1
> > ro=# update test_text set c2 = c1;
> > UPDATE 1
> > 
> > Then, trying to dump or copy that results in the following error:
> > 
> > ro=# COPY test_text TO '/tmp/test';
> > ERROR:  out of memory
> > DETAIL:  Cannot enlarge string buffer containing 536870913 bytes by
> > 536870912 more bytes.
> > 
> > In fact, the same thing happens when using a simple SELECT:
> > 
> > ro=# select * from test_text ;
> > ERROR:  out of memory
> > DETAIL:  Cannot enlarge string buffer containing 536870922 bytes by
> > 536870912 more bytes.
> > 
> > In the case of COPY, the server uses a StringInfo to output the row. The
> > problem is, a StringInfo is capped to MaxAllocSize (1GB - 1), but a row
> > should be able to hold much more than that.
> 
> Yeah, shoving a whole row into one StringInfo is ultimately going to
> limit a row to 1G, which is a far cry from what the docs claim. There's
> also going to be problems with FE/BE communications, because things like
> pq_sendbyte all use StringInfo as a buffer too. So while Postgres can
> store a 1.6TB row, you're going to find a bunch of stuff that doesn't
> work past around 1GB.
> 
> > So, is this a bug ? Or is there a caveat I would have missed in the
> > documentation ?
> 
> I suppose that really depends on your point of view. The real question
> is whether we think it's worth fixing, or a good idea to change the
> behavior of StringInfo.
> 

> StringInfo uses ints to store length, so it could possibly be changed,
> but then you'd just error out due to MaxAllocSize.
> 
> Now perhaps those could both be relaxed, but certainly not to the extent
> that you can shove an entire 1.6TB row into an output buffer.

Another way to look at it would be to work in small chunks. For the first test 
case (rows bigger than 1GB), maybe the COPY command could be rewritten to work 
in chunks, flushing the output more often if needed.
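
In the meantime, the same idea can be approximated on the client side by 
splitting the export so that each output row stays below 1GB. For the test 
table above, something like this should go through, since each COPY then 
only has to buffer one ~512MB column per row (the paths and the per-column 
split are just an illustration):

ro=# COPY (SELECT c1 FROM test_text) TO '/tmp/test_c1';
ro=# COPY (SELECT c2 FROM test_text) TO '/tmp/test_c2';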

For the conversion-related issues, I don't really see any other solution than 
extending StringInfo to allow for more than 1GB of data. On the other hand, 
those can easily be circumvented by using COPY ... WITH BINARY.
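
For illustration (table, sizes and paths made up here, and assuming the 
default bytea_output = 'hex', whose text representation is roughly twice 
the raw size):

ro=# create table test_bytea (b bytea);
ro=# insert into test_bytea values (convert_to(repeat('a', 600000000), 'UTF8'));
ro=# COPY test_bytea TO '/tmp/test_bytea';        -- hex text ~1.2GB, fails
ro=# COPY test_bytea TO '/tmp/test_bytea' WITH (FORMAT binary); -- ~600MB, ok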

> 
> The other issue is that there's a LOT of places in code that blindly
> copy detoasted data around, so while we technically support 1GB toasted
> values you're probably going to be quite unhappy with performance. I'm
> actually surprised you haven't already seen this with 500MB objects.
> 
> So long story short, I'm not sure how worthwhile it would be to try and
> fix this. We probably should improve the docs though.
> 

I think that having data that can't be output by pg_dump is quite surprising, 
and if this is not fixable, I agree that it should clearly be documented.

> Have you looked at using large objects for what you're doing? (Note that
> those have their own set of challenges and limitations.)

Yes, I have. This particular customer of ours did not mind the performance 
penalty of using bytea objects as long as it was convenient to use. 

> 
> > We also hit a second issue, this time related to bytea encoding.
> 
> There's probably several other places this type of thing could be a
> problem. I'm thinking of conversions in particular.

Yes, that's what the two other test cases I mentioned are about: any conversion 
leading to a size greater than 1GB results in an error, even implicit 
conversions like doubling backslashes in the output.
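
For illustration, a reduced example along those lines: the stored value is 
well under 1GB, but the COPY text format has to double every backslash on 
output, which pushes it past the limit:

ro=# create table test_backslash (c1 text);
ro=# insert into test_backslash values (repeat(E'\\', 600000000));
ro=# select octet_length(c1) from test_backslash;  -- 600000000, under 1GB
ro=# COPY test_backslash TO '/tmp/test_backslash'; -- doubled to ~1.2GB, fails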

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org
