Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 4/28/15 5:41 AM, José Luis Tallón wrote:
> On 04/27/2015 08:49 AM, Jim Nasby wrote:
>> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>>> Note if you are storing a table with rows that exceed 2KB in size (aggregate size of each row) then the Maximum number of rows in a table may be limited to 4 Billion, see TOAST.
>> That's not accurate though; you could be limited to far less than 4B rows. If each row has 10 fields that toast, you'd be limited to just 400M rows.
> ISTM like the solution is almost here, and could be done without too much (additional) work:
> * We have already discussed having a page-per-sequence with the new SeqAMs being introduced and how that would improve scalability.
> * We have commented on having a sequence per TOAST table (hence, 4B toasted values per table, each up to 4B chunks in size... vs. just 4B toasted values per cluster).
> I'm not sure that I can do it all by myself just yet, but I sure can try if there is interest.

I don't think it would be hard at all to switch toast pointers to being sequence-generated instead of OIDs. The only potential downside I see is the extra space required for all the sequences... but that would only matter on the tiniest of clusters (think embedded), which probably don't have that many tables to begin with.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 04/27/2015 08:49 AM, Jim Nasby wrote:
> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>> Note if you are storing a table with rows that exceed 2KB in size (aggregate size of each row) then the Maximum number of rows in a table may be limited to 4 Billion, see TOAST.
> That's not accurate though; you could be limited to far less than 4B rows. If each row has 10 fields that toast, you'd be limited to just 400M rows.

ISTM like the solution is almost here, and could be done without too much (additional) work:
* We have already discussed having a page-per-sequence with the new SeqAMs being introduced and how that would improve scalability.
* We have commented on having a sequence per TOAST table (hence, 4B toasted values per table, each up to 4B chunks in size... vs. just 4B toasted values per cluster).

I'm not sure that I can do it all by myself just yet, but I sure can try if there is interest. (Just after I'm done with another patch that is independent from this, though.) This would be material for 9.6, of course :)

Thanks,
    J.L.
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 4/25/15 1:19 PM, Bruce Momjian wrote:
> Note if you are storing a table with rows that exceed 2KB in size (aggregate size of each row) then the Maximum number of rows in a table may be limited to 4 Billion, see TOAST.

That's not accurate though; you could be limited to far less than 4B rows. If each row has 10 fields that toast, you'd be limited to just 400M rows.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
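[Editor's note: a back-of-envelope check of the figure above. The 10-field schema is Jim's hypothetical, not a real workload; the 32-bit OID space is per TOAST table, and every out-of-line value consumes one OID.]

```python
# Each toasted field of each row consumes one OID from the table's
# 32-bit TOAST value-ID space, so the OID budget is shared by all
# toastable columns of the table.
OID_SPACE = 2**32            # TOAST value identifiers are 32-bit OIDs
toasted_fields_per_row = 10  # hypothetical schema from the message

max_rows = OID_SPACE // toasted_fields_per_row
print(max_rows)  # 429496729 -- "just 400M rows" in round numbers
```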
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 4/27/15, Jim Nasby jim.na...@bluetreble.com wrote:
> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>> Note if you are storing a table with rows that exceed 2KB in size (aggregate size of each row) then the Maximum number of rows in a table may be limited to 4 Billion, see TOAST.
> That's not accurate though; you could be limited to far less than 4B rows. If each row has 10 fields that toast, you'd be limited to just 400M rows.

Good point. I noted that on the TOAST wiki page now, at least (and also mentioned that using partitioning is a workaround for now).
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 25/04/15 06:39, Jim Nasby wrote:
> On 4/24/15 7:11 PM, Álvaro Hernández Tortosa wrote:
>> On 24/04/15 05:24, Tom Lane wrote:
>>> ... TBH, I've got very little enthusiasm for fixing this given the number of reports of trouble from the field, which so far as I recall is zero. Álvaro's case came up through intentionally trying to create an unreasonable number of tables, not from real usage. This thread likewise appears to contain lots of speculation and no reports of anyone hitting a problem in practice.
>> It is certainly true that this was a very synthetic case. I envision, however, certain use cases where we may hit a very large number of tables:
> The original case has NOTHING to do with the number of tables and everything to do with the number of toasted values a table can have. If you have to toast 4B attributes in a single relation it will fail. In reality, if you get anywhere close to that, things will fall apart due to OID conflicts.
> This case isn't nearly as insane as 4B tables. A table storing 10 text fields each of which is 2K would hit this limit with only 400M rows. If my math is right that's only 8TB; certainly not anything insane space-wise or rowcount-wise. Perhaps it's still not worth fixing, but I think it's definitely worth documenting.

They are definitely different problems, but caused by similar symptoms: an oid wrapping around, or not even getting there: just trying to find an unused one. If fixed, we should probably look at both at the same time.

It's worth documenting, but also, as I said, maybe worth fixing, so that if three years from now they really show up, the solution is already in production (rather than in a patching state).

Regards,

Álvaro

--
Álvaro Hernández Tortosa --- 8Kdata
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
* Álvaro Hernández Tortosa (a...@8kdata.com) wrote:
> It's worth documenting, but also, as I said, maybe worth fixing, so that if three years from now they really show up, the solution is already in production (rather than in a patching state).

With the proliferation of JSON usage in PG thanks to jsonb, I'd count us lucky if we don't get complaints about this in the next three years. I don't expect to have time to work on it in the near future, unfortunately, but Robert's thoughts on supporting a new TOAST pointer structure (with a way to support what's currently there, to avoid an on-disk break) seem like a good starting point to me.

Thanks!

Stephen
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On Fri, Apr 24, 2015 at 11:39:04PM -0500, Jim Nasby wrote:
> On 4/24/15 7:11 PM, Álvaro Hernández Tortosa wrote:
>> On 24/04/15 05:24, Tom Lane wrote:
>>> ... TBH, I've got very little enthusiasm for fixing this given the number of reports of trouble from the field, which so far as I recall is zero. Álvaro's case came up through intentionally trying to create an unreasonable number of tables, not from real usage. This thread likewise appears to contain lots of speculation and no reports of anyone hitting a problem in practice.
>> It is certainly true that this was a very synthetic case. I envision, however, certain use cases where we may hit a very large number of tables:
> The original case has NOTHING to do with the number of tables and everything to do with the number of toasted values a table can have. If you have to toast 4B attributes in a single relation it will fail. In reality, if you get anywhere close to that, things will fall apart due to OID conflicts.
> This case isn't nearly as insane as 4B tables. A table storing 10 text fields each of which is 2K would hit this limit with only 400M rows. If my math is right that's only 8TB; certainly not anything insane space-wise or rowcount-wise. Perhaps it's still not worth fixing, but I think it's definitely worth documenting.

And it is now documented in the Postgres FAQ thanks to 'Rogerdpack', which is where that maximum-table text came from:

    https://wiki.postgresql.org/wiki/FAQ#What_is_the_maximum_size_for_a_row.2C_a_table.2C_and_a_database.3F

    Note if you are storing a table with rows that exceed 2KB in size (aggregate size of each row) then the Maximum number of rows in a table may be limited to 4 Billion, see TOAST.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com
+ Everyone has their own god. +
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On Thu, Apr 23, 2015 at 11:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Having said all that, if we did try to fix it today, I'd imagine changing TOAST value identifiers to int64 and inventing a new TOAST pointer format for use when 32 bits isn't wide enough for the ID. But I think we're best advised to hold off doing that until the need becomes pressing.

Just out of curiosity, has anyone thought about inventing a new TOAST pointer format on the grounds that our TOAST pointers are unreasonably large? IIUC, a TOAST pointer right now is 18 bytes: 16 for a varatt_external, and then that gets embedded in a varattrib_1b_e with a va_header byte and a va_tag byte. Eliminating one or both of va_rawsize and va_extsize from the TOAST pointer itself seems like it could save quite a bit of space on disk. Maybe you could even find a way to get rid of va_toastrelid; after all, at the point when you first acquire a pointer to the tuple, you surely know what relation it's a part of. You'd probably want to force de-TOASTing (or converting to a more expressive form of TOAST pointer, anyway) when you extracted the column from the tuple, which might be hard to arrange. But the benefits could be pretty significant.

Suppose you have a table where each tuple is 4K untoasted, with all but 100 bytes of that in a single column. So, as stored, you've got 100 bytes of regular stuff plus an 18-byte TOAST header. If you could trim 2 of the above-mentioned 4-byte fields out of the TOAST header, that would reduce the size of the main relation fork by almost 7%. If you could trim all 3 of them out, you'd save more than 10%. That's not nothing, and the benefits could be even larger for rows that contain multiple TOAST pointers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
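[Editor's note: a quick check of the savings Robert sketches. The 18-byte pointer layout and the 100-byte hypothetical row are from the message above; the percentages fall out of simple division.]

```python
# Robert's hypothetical: a 4K tuple toasts down to 100 bytes of regular
# data plus an 18-byte TOAST pointer in the main relation fork.
pointer = 18               # va_header + va_tag + 16-byte varatt_external
regular = 100              # non-toasted portion of the row
stored = regular + pointer # 118 bytes stored per row

save_two   = 2 * 4 / stored  # drop va_rawsize and va_extsize (4 bytes each)
save_three = 3 * 4 / stored  # also drop va_toastrelid
print(f"{save_two:.1%}, {save_three:.1%}")  # 6.8%, 10.2%
```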
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Apr 23, 2015 at 11:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Having said all that, if we did try to fix it today, I'd imagine changing TOAST value identifiers to int64 and inventing a new TOAST pointer format for use when 32 bits isn't wide enough for the ID. But I think we're best advised to hold off doing that until the need becomes pressing.
> Just out of curiosity, has anyone thought about inventing a new TOAST pointer format on the grounds that our TOAST pointers are unreasonably large?

I'd not thought about it, but it sure sounds like a good idea from here. It would be particularly great if we were able to do this and increase the number of supported toast pointers, and avoid having to go hunting for unused identifiers due to wrapping.

Thanks!

Stephen
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 24/04/15 05:24, Tom Lane wrote:
> Stephen Frost sfr...@snowman.net writes:
>> * Bruce Momjian (br...@momjian.us) wrote:
>>> On Sun, Feb 1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
>>>> The problem here is that performance degrades exponentially, or worse. Speaking here from experience, we already tested this for a very similar case (table creation, where two oids are consumed from a global sequence when inserting into pg_class). Have a look at http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013, slides 43-45. We tested this scenario there and showed that table creations per second dropped from 10K to a few per second and then to a few per day. In the graphs you can't even tell there were more tables being created. At around 8K tables from the theoretical limit of 4B oids consumed, the process basically stopped (doing more insertions).
>>> We don't report the maximum number of tables per database, or the maximum number of TOAST values. Agreed?
>> For my 2c, this limitation is a surprise to users and therefore we should add documentation to point out that it exists, unless we're going to actually fix it (which is certainly what I'd prefer to see...).
> TBH, I've got very little enthusiasm for fixing this given the number of reports of trouble from the field, which so far as I recall is zero. Álvaro's case came up through intentionally trying to create an unreasonable number of tables, not from real usage. This thread likewise appears to contain lots of speculation and no reports of anyone hitting a problem in practice.

It is certainly true that this was a very synthetic case. I envision, however, certain use cases where we may hit a very large number of tables:

- Massive multitenancy
- Aggressive partitioning
- Massive multitenancy with aggressive partitioning
- Dynamically generated tables, like those created by ToroDB (https://github.com/torodb/torodb). In ToroDB we generate tables depending only on the input data, so we may end up having as many as required by the datasource. For example, a general purpose json datastore may generate several tables per document inserted.

> Certainly this is likely to become an issue at some point in the future, but I'm not finding it very compelling to worry about now. By the time it does become an issue, we may have additional considerations or use cases that should inform a solution; which seems to me to be a good argument not to try to fix it in advance of real problems.

Perhaps. I understand this argument, and it makes sense. However, on the other side, given the long time it may take from patch to commit, and then from released version to companies finally using it in production, I'd rather try to fix it soon, as there are already reports and use cases that may hit it, rather than wait three years until it explodes in our faces. After all, 640Kb RAM is enough, right? So maybe 2B tables is not that far on the horizon. Who knows.

Regards,

Álvaro

--
Álvaro Hernández Tortosa --- 8Kdata
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 4/24/15 7:11 PM, Álvaro Hernández Tortosa wrote:
> On 24/04/15 05:24, Tom Lane wrote:
>> ... TBH, I've got very little enthusiasm for fixing this given the number of reports of trouble from the field, which so far as I recall is zero. Álvaro's case came up through intentionally trying to create an unreasonable number of tables, not from real usage. This thread likewise appears to contain lots of speculation and no reports of anyone hitting a problem in practice.
> It is certainly true that this was a very synthetic case. I envision, however, certain use cases where we may hit a very large number of tables:

The original case has NOTHING to do with the number of tables and everything to do with the number of toasted values a table can have. If you have to toast 4B attributes in a single relation it will fail. In reality, if you get anywhere close to that, things will fall apart due to OID conflicts.

This case isn't nearly as insane as 4B tables. A table storing 10 text fields each of which is 2K would hit this limit with only 400M rows. If my math is right that's only 8TB; certainly not anything insane space-wise or rowcount-wise. Perhaps it's still not worth fixing, but I think it's definitely worth documenting.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
* Bruce Momjian (br...@momjian.us) wrote:
> On Sun, Feb 1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
>> The problem here is that performance degrades exponentially, or worse. Speaking here from experience, we already tested this for a very similar case (table creation, where two oids are consumed from a global sequence when inserting into pg_class). Have a look at http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013, slides 43-45. We tested this scenario there and showed that table creations per second dropped from 10K to a few per second and then to a few per day. In the graphs you can't even tell there were more tables being created. At around 8K tables from the theoretical limit of 4B oids consumed, the process basically stopped (doing more insertions).
> I had a look at our FAQ about Postgres limitations and I don't see anything that needs changing: https://wiki.postgresql.org/wiki/FAQ#What_is_the_maximum_size_for_a_row.2C_a_table.2C_and_a_database.3F
>     Maximum size for a database?  unlimited (32 TB databases exist)
>     Maximum size for a table?     32 TB
>     Maximum size for a row?       400 GB
>     Maximum size for a field?     1 GB
>     Maximum number of rows in a table?    unlimited
>     Maximum number of columns in a table? 250-1600 depending on column types
>     Maximum number of indexes on a table? unlimited
> We don't report the maximum number of tables per database, or the maximum number of TOAST values. Agreed?

For my 2c, this limitation is a surprise to users and therefore we should add documentation to point out that it exists, unless we're going to actually fix it (which is certainly what I'd prefer to see...).

As for the other discussion on the thread, having a per-table sequence would be far better as it'd reduce the wrap-around risk quite a bit and, further, we should be able to provide that without breaking the on-disk format.

What would be really nice is a way to expand the size of the key when needed: in other words, instead of wrapping around, if we actually hit 4B toasted values in a table then stick a flag somewhere for the next toasted value that says "this value is in the second toast table/fork", and then go up to 4B on that one, etc. That allows us to support more than 4B toasted values and doesn't require searching for holes in the values assigned. If we end up with empty toast tables eventually, then allow reusing them. Perhaps vacuum can even be used to make a note somewhere saying "this toast table is now empty and can be reused".

In the end, I'd like to think we can do better here than having a hard limit at 4B when it comes to how many values over a few KB we can store. As mentioned, that isn't all that much these days. I'm not saying that my proposal or what's been proposed upthread is the answer, but I've certainly built PG systems which store over 4B rows, and it's not hard to imagine cases where I might have wanted a toasted value for each of those rows.

Thanks,

Stephen
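[Editor's note: a toy sketch of the key-expansion idea Stephen describes; nothing like this exists in PostgreSQL. The point is that a wider logical key of (toast-table number, 32-bit id) rolls over into a fresh table instead of scanning the old one for unused OIDs.]

```python
class ToastIdAllocator:
    """Hypothetical allocator: when the 32-bit id space of the current
    toast table is exhausted, start a new toast table/fork rather than
    searching for holes among already-assigned values."""
    OID_MAX = 2**32

    def __init__(self):
        self.table_no = 0  # which toast table values currently go into
        self.next_oid = 1

    def allocate(self):
        if self.next_oid >= self.OID_MAX:
            self.table_no += 1  # the "flag" pointing at the next table
            self.next_oid = 1
        oid = self.next_oid
        self.next_oid += 1
        return (self.table_no, oid)  # wider logical key, never wraps

alloc = ToastIdAllocator()
alloc.next_oid = ToastIdAllocator.OID_MAX - 1  # simulate near-exhaustion
print(alloc.allocate())  # (0, 4294967295): last id in the first table
print(alloc.allocate())  # (1, 1): spills into the second table
```

Reusing emptied tables, as suggested for vacuum, would amount to resetting `table_no` entries whose tables no longer hold live values.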
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
Stephen Frost sfr...@snowman.net writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> On Sun, Feb 1, 2015 at 03:54:03PM +0100, Álvaro Hernández Tortosa wrote:
>>> The problem here is that performance degrades exponentially, or worse. Speaking here from experience, we already tested this for a very similar case (table creation, where two oids are consumed from a global sequence when inserting into pg_class). Have a look at http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013, slides 43-45. We tested this scenario there and showed that table creations per second dropped from 10K to a few per second and then to a few per day. In the graphs you can't even tell there were more tables being created. At around 8K tables from the theoretical limit of 4B oids consumed, the process basically stopped (doing more insertions).
>> We don't report the maximum number of tables per database, or the maximum number of TOAST values. Agreed?
> For my 2c, this limitation is a surprise to users and therefore we should add documentation to point out that it exists, unless we're going to actually fix it (which is certainly what I'd prefer to see...).

TBH, I've got very little enthusiasm for fixing this given the number of reports of trouble from the field, which so far as I recall is zero. Álvaro's case came up through intentionally trying to create an unreasonable number of tables, not from real usage. This thread likewise appears to contain lots of speculation and no reports of anyone hitting a problem in practice.

Certainly this is likely to become an issue at some point in the future, but I'm not finding it very compelling to worry about now. By the time it does become an issue, we may have additional considerations or use cases that should inform a solution; which seems to me to be a good argument not to try to fix it in advance of real problems. Perhaps, for example, we'd find that at the same time we ought to relax the 1GB limit on individual-value size; or perhaps not.

Having said all that, if we did try to fix it today, I'd imagine changing TOAST value identifiers to int64 and inventing a new TOAST pointer format for use when 32 bits isn't wide enough for the ID. But I think we're best advised to hold off doing that until the need becomes pressing.

regards, tom lane
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 02/03/2015 03:44 AM, Jim Nasby wrote:
> [snip]
>> The alternative could be some long LOB (HugeOBject?) using the equivalent to serial8, whereas regular LOBs would use serial4.
> Well, it depends on how we did this. We could (for example) add a field to pg_class that determines what type to use for toast pointers: OID, int, or bigint. That could then be taken into account in the *toast* functions.
> But as others have pointed out, we haven't even had any real complaints about toast using OIDs as being an issue until now, so I think it's premature to start messing with this. At most it's just something to keep in mind so we don't preclude doing this in the future.

A patch creating those HOBs (Huge Objects) might well make sense *after* the sequence refactoring gets merged. Removing the bottleneck due to the OID allocator for this use case will be definitely welcome (I don't dare to code that just yet, but here's hoping someone will step in O:-) )

> BTW, regarding the size of what gets toasted; I've often thought it would be useful to allow a custom size limit on columns so that you could easily force data to be toasted if you knew you were very unlikely to access it. Basically, a cheap form of vertical partitioning.

Hmmm... ALTER COLUMN ... SET STORAGE EXTERNAL / SET STORAGE EXTENDED? From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :

    ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }

This would do what you described, right?

HTH,

/ J.L.
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/3/15 9:50 AM, David Steele wrote:
> On 2/3/15 10:01 AM, José Luis Tallón wrote:
>> Hmmm... ALTER COLUMN ... SET STORAGE EXTERNAL / SET STORAGE EXTENDED? From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :
>>     ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
>> This would do what you described, right?
> EXTENDED is the default for most TOAST-able types and is still subject to TOAST_TUPLE_THRESHOLD, which is normally 2K. EXTERNAL is the same but with no compression. See: http://www.postgresql.org/docs/9.4/static/storage-toast.html

Right. I'd like to be able to set a per-column TOAST_TUPLE_THRESHOLD.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/3/15 10:01 AM, José Luis Tallón wrote:
> Hmmm... ALTER COLUMN ... SET STORAGE EXTERNAL / SET STORAGE EXTENDED? From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :
>     ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
> This would do what you described, right?

EXTENDED is the default for most TOAST-able types and is still subject to TOAST_TUPLE_THRESHOLD, which is normally 2K. EXTERNAL is the same but with no compression. See: http://www.postgresql.org/docs/9.4/static/storage-toast.html

--
David Steele
da...@pgmasters.net
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/3/15 5:27 PM, Jim Nasby wrote:
> On 2/3/15 9:50 AM, David Steele wrote:
>> EXTENDED is the default for most TOAST-able types and is still subject to TOAST_TUPLE_THRESHOLD, which is normally 2K. EXTERNAL is the same but with no compression. See: http://www.postgresql.org/docs/9.4/static/storage-toast.html
> Right. I'd like to be able to set a per-column TOAST_TUPLE_THRESHOLD.

No argument there. There are some columns that I would prefer to always TOAST because even 2K can be very big for some use cases.

--
David Steele
da...@pgmasters.net
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/2/15, José Luis Tallón jltal...@adv-solutions.net wrote:
> On 01/31/2015 12:25 AM, Jim Nasby wrote:
>> [snip]
>> It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex().
>> Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data.
> Indeed...
>> I don't think it would be horrifically hard to change the way toast OIDs are assigned (I'm thinking we'd basically switch to creating a sequence for every toast table), but I don't think anyone's ever tried to push toast hard enough to hit this kind of limit.
> We did. The Billion Table Project, part 2 (a.k.a. "when does Postgres' OID allocator become a bottleneck"). The allocator becomes essentially unusable at about 2.1B OIDs, where it performed very well at quite empty (< 100M objects) levels.
> So yes, using one sequence per TOAST table should help. Combined with the new SequenceAMs / sequence implementation being proposed (specifically: one file for all sequences in a certain tablespace), this should scale much better.

But it wouldn't be perfect, right? I mean, if you had multiple deletions/insertions and passed 4B, then the one sequence per TOAST table would still wrap [albeit more slowly], and performance would start degrading the same way. And there would still be the hard 4B limit.

Perhaps the foreign key to the TOAST table could be changed from oid (32 bits) to something else (64 bits) [as well as the sequence] so that it never wraps? What do you think? And would a more aggressive change like this have a chance of being accepted into the code base?

Thanks.
-roger-
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 02/02/2015 09:36 PM, Roger Pack wrote:
> On 2/2/15, José Luis Tallón jltal...@adv-solutions.net wrote:
>> On 01/31/2015 12:25 AM, Jim Nasby wrote:
>>> [snip]
>>> It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex().
>>> Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data.
>> Indeed...
>>> I don't think it would be horrifically hard to change the way toast OIDs are assigned (I'm thinking we'd basically switch to creating a sequence for every toast table), but I don't think anyone's ever tried to push toast hard enough to hit this kind of limit.
>> We did. The Billion Table Project, part 2 (a.k.a. "when does Postgres' OID allocator become a bottleneck"). The allocator becomes essentially unusable at about 2.1B OIDs, where it performed very well at quite empty (< 100M objects) levels.
>> So yes, using one sequence per TOAST table should help. Combined with the new SequenceAMs / sequence implementation being proposed (specifically: one file for all sequences in a certain tablespace), this should scale much better.
> But it wouldn't be perfect, right? I mean, if you had multiple deletions/insertions and passed 4B, then the one sequence per TOAST table would still wrap [albeit more slowly], and performance would start degrading the same way. And there would still be the hard 4B limit.
> Perhaps the foreign key to the TOAST table could be changed from oid (32 bits) to something else (64 bits) [as well as the sequence] so that it never wraps?

Hmm... 2^32 times approx. 2kB (as per the usual heuristics, ~4 rows per heap page) is 8796093022208 (~9e13) bytes... which results in 8192 1GB segments :O

Looks like partitioning might be needed much sooner than that (if only for index efficiency reasons)... unless access is purely sequential.

The problem with changing the id from 32 to 64 bits is that the storage *for everybody else* doubles, making the implementation slower for most... though this might actually not be that important.

The alternative could be some long LOB (HugeOBject?) using the equivalent to serial8, whereas regular LOBs would use serial4.

Anybody actually reaching this limit out there?

Regards,

/ J.L.
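[Editor's note: the arithmetic in José's message checks out; here it is spelled out, assuming the ~2kB-per-toasted-value heuristic from the message and PostgreSQL's 1GB relation segment size.]

```python
# Worst case if a single toast table actually used all 2^32 value OIDs
# at ~2 kB per value (the "~4 rows per 8 kB heap page" heuristic).
total_bytes = 2**32 * 2048        # 8796093022208, i.e. ~9e13 bytes (8 TiB)
segments = total_bytes // 2**30   # relations are split into 1 GB segment files
print(total_bytes, segments)      # 8796093022208 8192
```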
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
Roger Pack rogerdpa...@gmail.com writes: On 2/2/15, José Luis Tallón jltal...@adv-solutions.net wrote: So yes, using one sequence per TOAST table should help. Combined with the new SequenceAMs / sequence implementation being proposed (specifically: one file for all sequences in a certain tablespace) this should scale much better. But it wouldn't be perfect, right? I mean if you had multiple deletion/insertions and pass 4B then the one sequence per TOAST table would still wrap [albeit more slowly], and performance start degrading the same way. And there would still be the hard 4B limit. Perhaps the foreign key to the TOAST table could be changed from oid (32 bits) to something else (64 bits) [as well the sequence] so that it never wraps? What do you think? And would a more aggressive change like this have a chance of being accepted into the code base? There has been some thought about this, but I have seen no, zero, reports of anyone actually running into problems *in practice* (as opposed to contrived cases like can we create a billion tables). So we probably aren't going to want to address it until it starts being a real problem. The reason it's not as significant as you might think is that small field values (less than a couple KB *after compression*) don't get pushed out to the TOAST table, so they don't consume OIDs. And large field values, like megabytes worth, aren't a problem either because you just aren't gonna have that many of them. (Simple arithmetic.) You could potentially get into trouble if you had a whole lot of entries that were just a little over the toasting threshold, because then you'd have a lot of OIDs consumed but still a manageable total amount of disk space. But that doesn't seem to be a very common usage pattern. Also, partitioning the table largely eliminates the problem because each partition will have its own TOAST table. 
I'm on record as saying that many people are far too quick to decide that they need partitioning; but once you get into the volume of data where 4B toast entries start to look like a limitation, you will probably have other reasons to think that you need to partition. In short, this is something that's theoretically interesting but doesn't seem worth doing in practice --- yet anyway. regards, tom lane
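Tom's "simple arithmetic" can be made concrete with a toy estimator (Python; the workload numbers are made-up assumptions, not from the thread). Each value pushed out of line consumes one OID in its table's toast table, no matter how many chunks it is split into, so the danger case is many values just over the ~2 kB threshold:

```python
# Toy estimator: toast OIDs consumed by a table. One OID per out-of-line
# value; all chunks of the same value share that OID.
OID_LIMIT = 2 ** 32

def toast_oids(rows: int, toasted_fields_per_row: int) -> int:
    """OIDs consumed, assuming every such field really is stored out of line."""
    return rows * toasted_fields_per_row

# Hypothetical worst case: 400M rows, each with 10 fields just over the
# ~2 kB (post-compression) toast threshold.
used = toast_oids(400_000_000, 10)
print(used)               # 4000000000
print(used / OID_LIMIT)   # ~0.93: within striking distance of the limit
```

With megabyte-sized values, by contrast, disk space runs out long before 4B OIDs do, which is why only the "barely toasted" pattern is at risk.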
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/2/15 3:50 PM, José Luis Tallón wrote: Hmm. 2^32 times approx. 2kB (as per the usual heuristics, ~4 rows per heap page) is 8796093022208 (~8.8e12) bytes ... which results in 8192 1GB segments :O Looks like partitioning might be needed much sooner than that (if only for index efficiency reasons) ... unless access is purely sequential. The problem with changing the id from 32 to 64 bits is that the storage *for everybody else* doubles, making the implementation slower for most, though this might actually not be that important. The alternative could be some long LOB (HugeObject?) using the equivalent of serial8, whereas regular LOBs would use serial4. Well, it depends on how we did this. We could (for example) add a field to pg_class that determines what type to use for toast pointers: OID, int, or bigint. That could then be taken into account in the *toast* functions. But as others have pointed out, we haven't even had any real complaints about toast using OIDs as being an issue until now, so I think it's premature to start messing with this. At most it's just something to keep in mind so we don't preclude doing this in the future. BTW, regarding the size of what gets toasted: I've often thought it would be useful to allow a custom size limit on columns so that you could easily force data to be toasted if you knew you were very unlikely to access it. Basically, a cheap form of vertical partitioning. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
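A rough sketch of the on-disk cost being debated (Python's struct module; the four-field layout mirrors PostgreSQL's varatt_external toast pointer -- raw size, external size, value id, toast relation OID -- but the exact byte counts here are illustrative, not authoritative):

```python
import struct

# Sketch of a toast pointer's payload, modeled on varatt_external:
# (int32 rawsize, int32 extsize, value id, uint32 toast relation OID).
# '=' disables padding, matching a packed on-disk representation.
PTR_OID32 = struct.calcsize('=iiII')   # 32-bit value id
PTR_OID64 = struct.calcsize('=iiQI')   # hypothetical 64-bit value id

print(PTR_OID32, PTR_OID64)  # 16 20
```

So under this model only the id field itself doubles; the pointer as a whole grows by 4 bytes per externally stored datum, which is paid once per toast pointer in the main heap rather than doubling row storage outright.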
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 1/30/15, Jim Nasby jim.na...@bluetreble.com wrote: On 1/30/15 11:54 AM, Roger Pack wrote: On 1/29/15, Roger Pack rogerdpa...@gmail.com wrote: Hello. I see on this page a mention of basically a 4B row limit for tables that have BLOB's Oops I meant for BYTEA or TEXT columns, but it's possible the reasoning is the same... It only applies to large objects, not bytea or text. OK I think I figured out possibly why the wiki says this. I guess BYTEA entries > 2KB will be autostored via TOAST, which uses an OID in its backend. So BYTEA has the same limitation. It appears that disabling TOAST is not an option [1]. So I guess if the number of BYTEA entries (in the sum of all tables? partitioning doesn't help?) with size > 2KB is > 4 billion then there is actually no option there? If this occurred it might cause all sorts of things to break? [2] It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex(). Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data. OK so system stability doesn't degrade per se when it wraps, good to know. So basically when it gets near 4B rows it may have to wrap that counter multiple times, and for each entry it searches whether the OID is already used, etc. So I guess partitioning tables for now is an acceptable workaround, good to know. 
Thanks much for your response, good to know the details before we dive into postgres with our 8B row table with BYTEA's in it :)
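That probing behavior can be simulated in miniature (Python; a set stands in for the toast table's chunk_id index, and the toy OID space is tiny so the near-exhaustion effect shows up immediately):

```python
# Simulate GetNewOidWithIndex-style allocation: bump a global counter and
# retry until the value is not already present in the table's index.
import random

SPACE = 10_000                 # toy OID space (the real one is 2^32)
random.seed(42)
# Pretend 99% of the OIDs in this toast table are still live.
used = set(random.sample(range(SPACE), SPACE * 99 // 100))
counter = 0

def alloc_oid():
    """Advance the counter past in-use values; return (oid, probes needed)."""
    global counter
    probes = 0
    while True:
        counter = (counter + 1) % SPACE
        probes += 1
        if counter not in used:   # stands in for the index lookup
            used.add(counter)
            return counter, probes

total_probes = sum(alloc_oid()[1] for _ in range(50))
print(total_probes / 50)   # average probes per allocation: far more than 1
```

On an empty table each allocation succeeds on the first probe; here, with 99% of the space occupied, each allocation pays many index lookups, which is the performance cliff Jim describes.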
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 01/31/2015 12:25 AM, Jim Nasby wrote: [snip] It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex(). Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data. Indeed ... I don't think it would be horrifically hard to change the way toast OIDs are assigned (I'm thinking we'd basically switch to creating a sequence for every toast table), but I don't think anyone's ever tried to push toast hard enough to hit this kind of limit. We did. The Billion Table Project, part 2 (a.k.a. when does Postgres' OID allocator become a bottleneck). The allocator becomes essentially unusable at about 2.1B OIDs, where it performed very well at quite empty (< 100M objects) levels. So yes, using one sequence per TOAST table should help. Combined with the new SequenceAMs / sequence implementation being proposed (specifically: one file for all sequences in a certain tablespace) this should scale much better. My 2c. Regards, / J.L.
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 31/01/15 14:55, Roger Pack wrote: [...] Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data. OK so system stability doesn't degrade per se when it wraps [since they all use that GetNewOid method or similar?], good to know. So basically when it gets near 4B TOAST'ed rows it may have to wrap that counter and search for an unused number, and for each number it's querying the TOAST table to see if it's already used, degrading performance. The problem here is that performance degrades exponentially, or worse. Speaking here from experience, we already tested this for a very similar case (table creation, where two oids are consumed from a global sequence when inserting into pg_class). Have a look at http://www.slideshare.net/nosys/billion-tables-project-nycpug-2013, slides 43-45. We tested this scenario there and showed that table creations per second dropped from 10K to a few per second and then to a few per day. In the graphs you can't even tell that more tables were being created. At around 8K tables short of the theoretical limit of 4B oids consumed, the process basically stopped (doing more insertions). Hope that this information helps. Best regards, Álvaro -- Álvaro Hernández Tortosa --- 8Kdata
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
Oops, forgot to forward to the list (suggestion/feature request to the list admin for the various pg lists: make the default reply go to the list, not the sender, if at all possible). Response below: On 1/30/15, Jim Nasby jim.na...@bluetreble.com wrote: On 1/30/15 11:54 AM, Roger Pack wrote: On 1/29/15, Roger Pack rogerdpa...@gmail.com wrote: Hello. I see on this page a mention of basically a 4B row limit for tables that have BLOB's Oops I meant for BYTEA or TEXT columns, but it's possible the reasoning is the same... It only applies to large objects, not bytea or text. OK I think I figured out possibly why the wiki says this. I guess BYTEA entries > 2KB will be autostored via TOAST, which uses an OID in its backend. So BYTEA has the same limitation. It appears that disabling TOAST is not an option [1]. So I guess if the number of BYTEA entries (in the sum of all tables? partitioning doesn't help?) with size > 2KB is > 4 billion then there is actually no option there? If this occurred it might cause all sorts of things to break? [2] It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex(). Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data. OK so system stability doesn't degrade per se when it wraps [since they all use that GetNewOid method or similar?], good to know. 
So basically when it gets near 4B TOAST'ed rows it may have to wrap that counter and search for an unused number, and for each number it's querying the TOAST table to see if it's already used, degrading performance. So I guess partitioning tables for now is an acceptable workaround, good to know. Thanks much for your response, good to know the details before we dive into postgres with our 8B row table with BYTEA's in it :)
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 1/30/15 11:54 AM, Roger Pack wrote: On 1/29/15, Roger Pack rogerdpa...@gmail.com wrote: Hello. I see on this page a mention of basically a 4B row limit for tables that have BLOB's Oops I meant for BYTEA or TEXT columns, but it's possible the reasoning is the same... It only applies to large objects, not bytea or text. OK I think I figured out possibly why the wiki says this. I guess BYTEA entries > 2KB will be autostored via TOAST, which uses an OID in its backend. So BYTEA has the same limitation. It appears that disabling TOAST is not an option [1]. So I guess if the number of BYTEA entries (in the sum of all tables? partitioning doesn't help?) with size > 2KB is > 4 billion then there is actually no option there? If this occurred it might cause all sorts of things to break? [2] It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex(). Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data. I don't think it would be horrifically hard to change the way toast OIDs are assigned (I'm thinking we'd basically switch to creating a sequence for every toast table), but I don't think anyone's ever tried to push toast hard enough to hit this kind of limit. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! 
http://BlueTreble.com