Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Seamus Abshere
On Fri, Oct 13, 2017, at 03:16 PM, David G. Johnston wrote: > implement a "system-managed-enum" type with many of the same properties [...] > TOAST does involve compression but the input to > the compression algorithm is a single cell (row and column) in a table. > As noted above I consider the

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Thomas Kellerer
Seamus Abshere wrote on 13.10.2017 at 18:43: On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote: Theoretically / blue sky, could there be a table or column type that transparently handles "shared strings" like this, reducing size on disk at the cost of lookup overhead for all queries? (I

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread David G. Johnston
On Fri, Oct 13, 2017 at 9:29 AM, Seamus Abshere wrote: > > On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote > > > Theoretically / blue sky, could there be a table or column type that > > > transparently handles "shared strings" like this, reducing size on disk > > > at

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Peter J. Holzer
On 2017-10-13 12:49:21 -0300, Seamus Abshere wrote: > In the spreadsheet world, there is this concept of "shared strings," a > simple way of compressing spreadsheets when the data is duplicated in > many cells. > > In my database, I have a table with >200 million rows and >300 columns > (all the

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Melvin Davidson
On Fri, Oct 13, 2017 at 12:52 PM, Melvin Davidson wrote: > > > On Fri, Oct 13, 2017 at 12:43 PM, Seamus Abshere > wrote: > >> > > On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote: >> > >> Theoretically / blue sky, could there be a table or column

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Melvin Davidson
On Fri, Oct 13, 2017 at 12:43 PM, Seamus Abshere wrote: > > > On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote: > > >> Theoretically / blue sky, could there be a table or column type that > > >> transparently handles "shared strings" like this, reducing size on > disk >

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Seamus Abshere
> > On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote: > >> Theoretically / blue sky, could there be a table or column type that > >> transparently handles "shared strings" like this, reducing size on disk > >> at the cost of lookup overhead for all queries? > >> (I guess maybe it's like

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Melvin Davidson
On Fri, Oct 13, 2017 at 12:12 PM, David G. Johnston < david.g.johns...@gmail.com> wrote: > On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere > wrote: > >> Theoretically / blue sky, could there be a table or column type that >> transparently handles "shared strings" like this,

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Seamus Abshere
> On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote > > Theoretically / blue sky, could there be a table or column type that > > transparently handles "shared strings" like this, reducing size on disk > > at the cost of lookup overhead for all queries? > > (I guess maybe it's like TOAST, but

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread David G. Johnston
On Fri, Oct 13, 2017 at 8:49 AM, Seamus Abshere wrote: > Theoretically / blue sky, could there be a table or column type that > transparently handles "shared strings" like this, reducing size on disk > at the cost of lookup overhead for all queries? > > (I guess maybe it's

Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Rob Sargent
On 10/13/2017 09:49 AM, Seamus Abshere wrote: hey, In the spreadsheet world, there is this concept of "shared strings," a simple way of compressing spreadsheets when the data is duplicated in many cells. In my database, I have a table with >200 million rows and >300 columns (all the

[GENERAL] "Shared strings"-style table

2017-10-13 Thread Seamus Abshere
hey, In the spreadsheet world, there is this concept of "shared strings," a simple way of compressing spreadsheets when the data is duplicated in many cells. In my database, I have a table with >200 million rows and >300 columns (all the households in the United States). For clarity of
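
For readers unfamiliar with the "shared strings" idea mentioned in the original post, here is a minimal sketch of it in Python. It only illustrates the general technique (each distinct string is stored once and cells hold small integer references) and is not how any spreadsheet format or PostgreSQL implements it. The function names and the sample household values are hypothetical and were not part of the thread.

```python
def encode_shared_strings(rows):
    """Replace each cell with an index into a table of unique strings."""
    table = []      # unique strings, in first-seen order
    index_of = {}   # string -> its position in `table`
    encoded = []
    for row in rows:
        encoded_row = []
        for cell in row:
            if cell not in index_of:
                # First time this string is seen: store it once.
                index_of[cell] = len(table)
                table.append(cell)
            encoded_row.append(index_of[cell])
        encoded.append(encoded_row)
    return table, encoded


def decode_shared_strings(table, encoded):
    """Look every index back up; this is the per-read overhead."""
    return [[table[i] for i in row] for row in encoded]


# Hypothetical sample data, invented for illustration only.
rows = [
    ["owner", "single family"],
    ["renter", "single family"],
    ["owner", "condo"],
]
table, encoded = encode_shared_strings(rows)
restored = decode_shared_strings(table, encoded)
```

The trade-off is the one raised in the question: repeated values take less space, but every read pays for a lookup in the string table.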