Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Andres Freund
Hi,

On 2018-02-12 09:54:29 +1030, Andrew Dunstan wrote:
> The idea is to have an append-only list of labels
> which would not obey transactional semantics, and would thus help us
> avoid the pitfalls of enums - there wouldn't be any rollback of an
> addition.

FWIW, I think we can resolve the

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Andrew Kane
They'd refer to separate enums. I originally thought an enum was a good comparison for this feature, but I'm no longer sure that it is. A text-based ordering would be desired rather than the label index. A better comparison may be a two-column lookup table:

-- create
CREATE TABLE cities (id

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Mark Dilger
> On Feb 12, 2018, at 6:35 PM, Tom Lane wrote:
>
> Andrew Kane writes:
>> Thanks everyone for the feedback. The current enum implementation requires
>> you to create a new type and add labels outside a transaction prior to an
>> insert.
>
> Right ...

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Mark Dilger
> On Feb 12, 2018, at 5:08 PM, Andrew Kane wrote:
>
> Thanks everyone for the feedback. The current enum implementation requires
> you to create a new type and add labels outside a transaction prior to an
> insert.
>
> -- on table creation
> CREATE TYPE city AS ENUM ();

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Tom Lane
Andrew Kane writes:
> Thanks everyone for the feedback. The current enum implementation requires
> you to create a new type and add labels outside a transaction prior to an
> insert.

Right ...

> Since enums have a fixed number of labels, this type of feature may be
>

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Andrew Kane
Thanks everyone for the feedback. The current enum implementation requires you to create a new type and add labels outside a transaction prior to an insert.

-- on table creation
CREATE TYPE city AS ENUM ();
CREATE TABLE "users" ("city" city);

-- on insert
ALTER TYPE city ADD VALUE IF NOT EXISTS

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Mark Dilger
> On Feb 10, 2018, at 7:46 PM, Andrew Kane wrote:
>
> Hi,
>
> I'm hoping to get feedback on an idea for a new data type to allow for
> efficient storage of text values while keeping reads and writes
> user-friendly. Suppose you want to store categorical data like

Re: A space-efficient, user-friendly way to store categorical data

2018-02-12 Thread Joe Conway
On 02/11/2018 10:06 PM, Thomas Munro wrote:
> On Mon, Feb 12, 2018 at 12:24 PM, Andrew Dunstan wrote:
>> On Mon, Feb 12, 2018 at 9:10 AM, Tom Lane wrote:
>>> Andrew Kane writes:
>>>> A better option could be a new

Re: A space-efficient, user-friendly way to store categorical data

2018-02-11 Thread Thomas Munro
On Mon, Feb 12, 2018 at 12:24 PM, Andrew Dunstan wrote:
> On Mon, Feb 12, 2018 at 9:10 AM, Tom Lane wrote:
>> Andrew Kane writes:
>>> A better option could be a new "dynamic enum" type, which would have
>>> similar

Re: A space-efficient, user-friendly way to store categorical data

2018-02-11 Thread Andrew Dunstan
On Mon, Feb 12, 2018 at 9:10 AM, Tom Lane wrote:
> Andrew Kane writes:
>> A better option could be a new "dynamic enum" type, which would have
>> similar storage requirements as an enum, but instead of labels being
>> declared ahead of time, they would

Re: A space-efficient, user-friendly way to store categorical data

2018-02-11 Thread Tom Lane
Andrew Kane writes:
> A better option could be a new "dynamic enum" type, which would have
> similar storage requirements as an enum, but instead of labels being
> declared ahead of time, they would be added as data is inserted.

You realize, of course, that it's possible to

A space-efficient, user-friendly way to store categorical data

2018-02-11 Thread Andrew Kane
Hi,

I'm hoping to get feedback on an idea for a new data type to allow for efficient storage of text values while keeping reads and writes user-friendly. Suppose you want to store categorical data like current city for users. There will be a long list of cities, and many users will have the same