Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Andres Freund
On 2015-05-18 19:59:29 -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On 2015-05-18 19:23:59 -0400, Tom Lane wrote:
  Hm, just forcing a collation and restricting the input to ascii should
  work, right?
 
 I think that's fragile as can be.  Is there a *really really* good
 argument why these things shouldn't be subject to identifier length
 restrictions?

It's maybe not absolutely strictly necessary. In fact in earlier
versions of the patch it was name. But replication solutions like bdr,
slony, whatever will have to store a bunch of values identifying a node
in there. And that's much easier if you're not constrained by 63 chars.

  While I'm looking at it, why in the world have roident and not just a
  standard system OID column?  This catalog seems willfully ignorant of
  Postgres conventions.
 
  There's a comment:
   * Needs to fit into an uint16, so we don't waste too much space in WAL
   * records. For this reason we don't use a normal Oid column here, since
   * we need to handle allocation of new values manually.
 
 If it needs to fit into uint16, why not make it smallint?  The declaration
 seems 100% misleading if it's not an OID.

smallint has a smaller range. I mean, we could use a smallint and just
store unsigned values nonetheless. But that'd be somewhat ugly, although
not without precedent (pg_class.relpages).  There'll only ever be very
few rows in pg_replication_origin, so the wideness itself doesn't
matter.

One reason for leaving it a oid instead of a more fitting type is that
it'll make it much smother to increase the limit to full 32bit -
there'll be no user level change.

 Moreover, the catalog infrastructure is failing to help you make sure
 the values are unique.

Not sure what you mean? There's both a unique key and locking in place
to make sure that's not violated.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-05-18 23:22:33 -0400, Tom Lane wrote:
 I'm inclined to think I should revert b82a7be603f1811a and instead make
 the seclabel provider columns use text_pattern_ops.  That would fix
 their collation problem with less of a backwards compatibility hazard.

 Sounds good to me. Are you doing that or should I?

I can deal with it, but not till mañana.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-05-18 19:59:29 -0400, Tom Lane wrote:
 I think that's fragile as can be.

 Hm. I think actually just forcing a collation would bring this on-par
 with name, right? We don't have any guarantees about the contents of
 e.g. pg_database.datname being meaningful in another database with a
 different encoding. In fact even the current database may have a name
 that's in a wrong encoding.

Oh, wait a minute.  I just noticed that you have
pg_replication_origin_roname_index defined to use varchar_pattern_ops.
Now, this is mildly broken: it should be text_pattern_ops.  But as far as
I can see offhand, that eliminates the collation dependency for the index.
The comparison rule is memcmp() which is not collation sensitive.

I'm inclined to think I should revert b82a7be603f1811a and instead make
the seclabel provider columns use text_pattern_ops.  That would fix
their collation problem with less of a backwards compatibility hazard.

 I'm right now toying with the idea of defining 'varname' as a text
 equivalent that always has a C type collation, and no length
 limitation.

That doesn't really address the encoding problem, so I'm not sure it
advances the state of the art particularly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Andres Freund
On 2015-05-18 23:22:33 -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On 2015-05-18 19:59:29 -0400, Tom Lane wrote:
  I think that's fragile as can be.

  Hm. I think actually just forcing a collation would bring this on-par
  with name, right? We don't have any guarantees about the contents of
  e.g. pg_database.datname being meaningful in another database with a
  different encoding. In fact even the current database may have a name
  that's in a wrong encoding.

 Oh, wait a minute.  I just noticed that you have
 pg_replication_origin_roname_index defined to use varchar_pattern_ops.
 Now, this is mildly broken: it should be text_pattern_ops.  But as far as
 I can see offhand, that eliminates the collation dependency for the index.
 The comparison rule is memcmp() which is not collation sensitive.

Hah. Right. Forgot about that. Oh Brain, where art thou.

 I'm inclined to think I should revert b82a7be603f1811a and instead make
 the seclabel provider columns use text_pattern_ops.  That would fix
 their collation problem with less of a backwards compatibility hazard.

Sounds good to me. Are you doing that or should I?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
 Andrew Gierth and...@tao11.riddles.org.uk writes:
 So while helping someone with an unrelated issue, I did a quick query to
 look for collation-dependent indexes, and was rather shocked to find
 that not only are there two such in the system catalogs, both set to
 default collation, but that one of them is in a _shared_ catalog
 (pg_shseclabel).
 How did that happen? And how could it possibly work?

 It probably doesn't, and the reason nobody has noticed is that the
 security label stuff has fewer users than I have fingers (and those
 people aren't using provider names that would cause anything interesting
 to happen).
 
 The most obvious fix is to change provider to a NAME column.

 Yea. I'm not sure why that wasn't done initially.

OK, now I'm on the warpath, because I went to fix this and discovered
that since that discussion, somebody named Freund committed yet another
shared catalog with a collation-dependent index.  This time, at least,
we can fix it *before* it gets into the wild.

Is it okay to change pg_replication_origin.roname to type name,
and if not what do you want to do instead?

While I'm looking at it, why in the world have roident and not just a
standard system OID column?  This catalog seems willfully ignorant of
Postgres conventions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Stephen Frost
* Andres Freund (and...@anarazel.de) wrote:
 On 2015-05-18 19:59:29 -0400, Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
   Hm, just forcing a collation and restricting the input to ascii should
   work, right?
  
  I think that's fragile as can be.
 
 Hm. I think actually just forcing a collation would bring this on-par
 with name, right? We don't have any guarantees about the contents of
 e.g. pg_database.datname being meaningful in another database with a
 different encoding. In fact even the current database may have a name
 that's in a wrong encoding.
 
 I'm right now toying with the idea of defining 'varname' as a text
 equivalent that always has a C type collation, and no length
 limitation. That'd generally imo be a good thing to have. A bunch of
 places really don't need the fixed width type and using a variable
 length type will save space. It'll also be a miniscule start twoards
 allowing longer identifiers...

Yes, please?

Would be great to finally make that happen...

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Andres Freund
On 2015-05-18 19:23:59 -0400, Tom Lane wrote:
 OK, now I'm on the warpath, because I went to fix this and discovered
 that since that discussion, somebody named Freund committed yet another
 shared catalog with a collation-dependent index.  This time, at least,
 we can fix it *before* it gets into the wild.

Hrmpf, good point.

 Is it okay to change pg_replication_origin.roname to type name,
 and if not what do you want to do instead?

It was turned into text after it initially was name, because of length
concerns.

Hm, just forcing a collation and restricting the input to ascii should
work, right? What I'm wondering is how we easily can do the collation
forcing part. The best seems to be to force the collation on the column
itself. We could add BKI_COLLATION(). Or we could invent a alias
'systext' or something that's intended to be used in catalogs?

 While I'm looking at it, why in the world have roident and not just a
 standard system OID column?  This catalog seems willfully ignorant of
 Postgres conventions.

There's a comment:
 * Needs to fit into an uint16, so we don't waste too much space in WAL
 * records. For this reason we don't use a normal Oid column here, since
 * we need to handle allocation of new values manually.

I mean it could use the standard oid, but given it's allocated
differently...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-05-18 19:23:59 -0400, Tom Lane wrote:
 Is it okay to change pg_replication_origin.roname to type name,
 and if not what do you want to do instead?

 It was turned into text after it initially was name, because of length
 concerns.

 Hm, just forcing a collation and restricting the input to ascii should
 work, right?

I think that's fragile as can be.  Is there a *really really* good
argument why these things shouldn't be subject to identifier length
restrictions?

 While I'm looking at it, why in the world have roident and not just a
 standard system OID column?  This catalog seems willfully ignorant of
 Postgres conventions.

 There's a comment:
* Needs to fit into an uint16, so we don't waste too much space in WAL
* records. For this reason we don't use a normal Oid column here, since
* we need to handle allocation of new values manually.

If it needs to fit into uint16, why not make it smallint?  The declaration
seems 100% misleading if it's not an OID.  Moreover, the catalog
infrastructure is failing to help you make sure the values are unique.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Andres Freund
On 2015-05-18 20:19:29 -0400, Tom Lane wrote:
 Many people rely on UUIDs being impervious to chance collisions, so
 it's not clear to me why uniqueness within 63 characters is unachievable.
 Even more, if you can't do it in 63, what makes you think that 100 is
 better?

Well UUIDs are also hard to manage because they're pretty much bare of
any meaning. It's much easier to understand
'slony:{node=nodename,role=master,id=someid}' or similar than
'slony:cc70ac60-fdbd-11e4-b939-0800200c9a66'.

 Also, is a length limit really more onerous than the ASCII-only
 restriction you proposed?  (As an ASCII-only kind of guy, it wouldn't
 bother me any; but I suspect much of the world would beg to differ.)

I don't think anybody is going to be too concerned about node names or
something similar being ascii only. There's already a more restrictive
naming policy in place for replication slots...

 If you had both 1 and 1 + 2^20 in there, the existing unique index
 would not complain, but in practice those are duplicate entries, no?
 If you make the column smallint such a case would be physically
 impossible.

There's an error check in place (ERROR, PROGRAM_LIMIT_EXCEEDED)
preventing it when creating a replication origin.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Andres Freund
On 2015-05-18 21:20:29 -0400, Stephen Frost wrote:
 * Andres Freund (and...@anarazel.de) wrote:
  I'm right now toying with the idea of defining 'varname' as a text
  equivalent that always has a C type collation, and no length
  limitation. That'd generally imo be a good thing to have. A bunch of
  places really don't need the fixed width type and using a variable
  length type will save space. It'll also be a miniscule start twoards
  allowing longer identifiers...
 
 Yes, please?
 
 Would be great to finally make that happen...

Uh, this would bring us there maybe 15% of the way?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Stephen Frost
* Andres Freund (and...@anarazel.de) wrote:
 On 2015-05-18 21:20:29 -0400, Stephen Frost wrote:
  * Andres Freund (and...@anarazel.de) wrote:
   I'm right now toying with the idea of defining 'varname' as a text
   equivalent that always has a C type collation, and no length
   limitation. That'd generally imo be a good thing to have. A bunch of
   places really don't need the fixed width type and using a variable
   length type will save space. It'll also be a miniscule start twoards
   allowing longer identifiers...
  
  Yes, please?
  
  Would be great to finally make that happen...
 
 Uh, this would bring us there maybe 15% of the way?

I can hope?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-05-18 19:59:29 -0400, Tom Lane wrote:
 I think that's fragile as can be.  Is there a *really really* good
 argument why these things shouldn't be subject to identifier length
 restrictions?

 It's maybe not absolutely strictly necessary. In fact in earlier
 versions of the patch it was name. But replication solutions like bdr,
 slony, whatever will have to store a bunch of values identifying a node
 in there. And that's much easier if you're not constrained by 63 chars.

Many people rely on UUIDs being impervious to chance collisions, so
it's not clear to me why uniqueness within 63 characters is unachievable.
Even more, if you can't do it in 63, what makes you think that 100 is
better?

Also, is a length limit really more onerous than the ASCII-only
restriction you proposed?  (As an ASCII-only kind of guy, it wouldn't
bother me any; but I suspect much of the world would beg to differ.)

 Moreover, the catalog infrastructure is failing to help you make sure
 the values are unique.

 Not sure what you mean? There's both a unique key and locking in place
 to make sure that's not violated.

If you had both 1 and 1 + 2^20 in there, the existing unique index would
not complain, but in practice those are duplicate entries, no?  If you
make the column smallint such a case would be physically impossible.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Andres Freund
On 2015-05-18 19:59:29 -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  Hm, just forcing a collation and restricting the input to ascii should
  work, right?
 
 I think that's fragile as can be.

Hm. I think actually just forcing a collation would bring this on-par
with name, right? We don't have any guarantees about the contents of
e.g. pg_database.datname being meaningful in another database with a
different encoding. In fact even the current database may have a name
that's in a wrong encoding.

I'm right now toying with the idea of defining 'varname' as a text
equivalent that always has a C type collation, and no length
limitation. That'd generally imo be a good thing to have. A bunch of
places really don't need the fixed width type and using a variable
length type will save space. It'll also be a miniscule start twoards
allowing longer identifiers...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-05-18 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 It's maybe not absolutely strictly necessary. In fact in earlier
 versions of the patch it was name. But replication solutions like bdr,
 slony, whatever will have to store a bunch of values identifying a node
 in there. And that's much easier if you're not constrained by 63 chars.

That's silly. We (third-party tools) already have to work around lots 
of things constrained by namedatalen.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201505182138
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlValBYACgkQvJuQZxSWSsiODwCfRDTNsEHKsp7rbK24lT4lApwa
X1sAn0QL33wJyn/AWT2aLL9u+Ybt+aNb
=VjvO
-END PGP SIGNATURE-




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-04-30 Thread Bruce Momjian
On Thu, Apr 30, 2015 at 08:16:09AM -0700, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
  The most obvious fix is to change provider to a NAME column.
 
  Where are we on this?
 
 Not done yet, but we should make a point of making that fix before 9.5.
 Please add it to the open items page for 9.5.
 
 I am not sure there's anything useful to be done about this in the back
 branches.

Done.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-04-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
 The most obvious fix is to change provider to a NAME column.

 Where are we on this?

Not done yet, but we should make a point of making that fix before 9.5.
Please add it to the open items page for 9.5.

I am not sure there's anything useful to be done about this in the back
branches.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-04-30 Thread Bruce Momjian
On Wed, Feb 25, 2015 at 10:19:45PM +0100, Andres Freund wrote:
 On 2015-02-25 15:59:55 -0500, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
   The most obvious fix is to change provider to a NAME column.
  
   Yea. I'm not sure why that wasn't done initially. I can't really see the
   length be an issue. How about we add an error check enforcing ascii,
   that'll work in the back branches?
  
  Nope, that won't help much at all.  C vs en_US for instance is different
  sort orders even with all-ASCII data.
 
 Ick, yes. The restriction to a charset that's encodable in all server
 encodings should be there additionally, but it's not sufficient :(
 
  Basically you're screwed if you've got different collations in different
  databases and you put anything into pg_shseclabel ...
 
 Hrmpf.

Where are we on this?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-03-04 Thread Robert Haas
On Wed, Feb 25, 2015 at 7:54 PM, David Steele da...@pgmasters.net wrote:
 +1 on 128/256 character names.

 /me runs and hides.

 /stands brazenly in the open and volunteers to try it if I don't get
 clobbered within seconds.

I think the question is whether making lots of rows in system catalogs
better is going to have undesirable effects on (a) the size of our
initial on-disk format (i.e. how big an empty database is), (b) the
amount of memory consumed by the syscache and relcaches on workloads
that touch lots of tables/functions/whatever, or (c) CPU consumption
mostly as a result of more cache line accesses for the same operation.
If you can prove those effects are minimal, that'd be a good place to
start.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-03-04 Thread David Steele
Hi Robert,

On 3/4/15 10:14 AM, Robert Haas wrote:
 On Wed, Feb 25, 2015 at 7:54 PM, David Steele da...@pgmasters.net wrote:
 +1 on 128/256 character names.

 /me runs and hides.

 /stands brazenly in the open and volunteers to try it if I don't get
 clobbered within seconds.
 
 I think the question is whether making lots of rows in system catalogs
 better is going to have undesirable effects on (a) the size of our
 initial on-disk format (i.e. how big an empty database is), (b) the
 amount of memory consumed by the syscache and relcaches on workloads
 that touch lots of tables/functions/whatever, or (c) CPU consumption
 mostly as a result of more cache line accesses for the same operation.
 If you can prove those effects are minimal, that'd be a good place to
 start.

Thanks, that's encouraging.  I've already compiled with NAMEDATALEN=256
and verified that the only failing tests are the ones making sure that
identifier lengths are truncated or fail appropriately when they are 
63.  I'm sure lots of people have done that before and gotten the same
result.

I'm currently investigating the issues that you've identified above
since they constitute the real problem with increasing NAMEDATALEN.
Once I have some answers I'll send a proposal to hackers.

-- 
- David Steele
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

  So while helping someone with an unrelated issue, I did a quick
  query to look for collation-dependent indexes, and was rather
  shocked to find that not only are there two such in the system
  catalogs, both set to default collation, but that one of them is
  in a _shared_ catalog (pg_shseclabel).

  How did that happen? And how could it possibly work?

 Tom It probably doesn't, and the reason nobody has noticed is that the
 Tom security label stuff has fewer users than I have fingers (and
 Tom those people aren't using provider names that would cause anything
 Tom interesting to happen).

Or possibly not mixing locales between databases.

 Tom The most obvious fix is to change provider to a NAME column.

 Tom What was the other case?  We might want to add a regression test
 Tom to check for collation-dependent system indexes ...

pg_seclabel (also provider).

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Tom Lane
Andrew Gierth and...@tao11.riddles.org.uk writes:
 So while helping someone with an unrelated issue, I did a quick query to
 look for collation-dependent indexes, and was rather shocked to find
 that not only are there two such in the system catalogs, both set to
 default collation, but that one of them is in a _shared_ catalog
 (pg_shseclabel).

 How did that happen? And how could it possibly work?

It probably doesn't, and the reason nobody has noticed is that the
security label stuff has fewer users than I have fingers (and those
people aren't using provider names that would cause anything interesting
to happen).

The most obvious fix is to change provider to a NAME column.

What was the other case?  We might want to add a regression test to
check for collation-dependent system indexes ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Alvaro Herrera
Tom Lane wrote:
 Andrew Gierth and...@tao11.riddles.org.uk writes:
  So while helping someone with an unrelated issue, I did a quick query to
  look for collation-dependent indexes, and was rather shocked to find
  that not only are there two such in the system catalogs, both set to
  default collation, but that one of them is in a _shared_ catalog
  (pg_shseclabel).
 
  How did that happen? And how could it possibly work?
 
 It probably doesn't, and the reason nobody has noticed is that the
 security label stuff has fewer users than I have fingers (and those
 people aren't using provider names that would cause anything interesting
 to happen).

The BDR code has recently started using security labels as a place to
store table-specific data.  That widens its use a fair bit ... and most
likely, other extensions will also start using them as soon as they
realize that it can be used for stuff other than actual security labels.

(FWIW we shouldn't have called these security labels but just
generically labels or something like that.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Andrew Gierth and...@tao11.riddles.org.uk writes:
 How did that happen? And how could it possibly work?

 It probably doesn't, and the reason nobody has noticed is that the
 security label stuff has fewer users than I have fingers (and those
 people aren't using provider names that would cause anything interesting
 to happen).

 The BDR code has recently started using security labels as a place to
 store table-specific data.  That widens its use a fair bit ... and most
 likely, other extensions will also start using them as soon as they
 realize that it can be used for stuff other than actual security labels.

Yeah?  Would they be OK with redefining the provider field as name,
or would the length limit be an issue?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Tom Lane wrote:
  Andrew Gierth and...@tao11.riddles.org.uk writes:
  How did that happen? And how could it possibly work?
 
  It probably doesn't, and the reason nobody has noticed is that the
  security label stuff has fewer users than I have fingers (and those
  people aren't using provider names that would cause anything interesting
  to happen).
 
  The BDR code has recently started using security labels as a place to
  store table-specific data.  That widens its use a fair bit ... and most
  likely, other extensions will also start using them as soon as they
  realize that it can be used for stuff other than actual security labels.
 
 Yeah?  Would they be OK with redefining the provider field as name,
 or would the length limit be an issue?

Nah, it's fine.  The provider name used there is bdr.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Andres Freund
On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
 Andrew Gierth and...@tao11.riddles.org.uk writes:
  So while helping someone with an unrelated issue, I did a quick query to
  look for collation-dependent indexes, and was rather shocked to find
  that not only are there two such in the system catalogs, both set to
  default collation, but that one of them is in a _shared_ catalog
  (pg_shseclabel).
 
  How did that happen? And how could it possibly work?
 
 It probably doesn't, and the reason nobody has noticed is that the
 security label stuff has fewer users than I have fingers (and those
 people aren't using provider names that would cause anything interesting
 to happen).
 
 The most obvious fix is to change provider to a NAME column.

Yea. I'm not sure why that wasn't done initially. I can't really see the
length be an issue. How about we add an error check enforcing ascii,
that'll work in the back branches?

Generally it's not the greatest idea to have non-ascii stuff in shared
catalogs...

 What was the other case?  We might want to add a regression test to
 check for collation-dependent system indexes ...

+1

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
 The most obvious fix is to change provider to a NAME column.

 Yea. I'm not sure why that wasn't done initially. I can't really see the
 length be an issue. How about we add an error check enforcing ascii,
 that'll work in the back branches?

Nope, that won't help much at all.  C vs en_US for instance is different
sort orders even with all-ASCII data.

Basically you're screwed if you've got different collations in different
databases and you put anything into pg_shseclabel ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Andres Freund
On 2015-02-25 15:59:55 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2015-02-25 12:08:32 -0500, Tom Lane wrote:
  The most obvious fix is to change provider to a NAME column.
 
  Yea. I'm not sure why that wasn't done initially. I can't really see the
  length be an issue. How about we add an error check enforcing ascii,
  that'll work in the back branches?
 
 Nope, that won't help much at all.  C vs en_US for instance is different
 sort orders even with all-ASCII data.

Ick, yes. The restriction to a charset that's encodable in all server
encodings should be there additionally, but it's not sufficient :(

 Basically you're screwed if you've got different collations in different
 databases and you put anything into pg_shseclabel ...

Hrmpf.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Tom Lane wrote:
  Alvaro Herrera alvhe...@2ndquadrant.com writes:
   Tom Lane wrote:
   Andrew Gierth and...@tao11.riddles.org.uk writes:
   How did that happen? And how could it possibly work?
  
   It probably doesn't, and the reason nobody has noticed is that the
   security label stuff has fewer users than I have fingers (and those
   people aren't using provider names that would cause anything interesting
   to happen).
  
   The BDR code has recently started using security labels as a place to
   store table-specific data.  That widens its use a fair bit ... and most
   likely, other extensions will also start using them as soon as they
   realize that it can be used for stuff other than actual security labels.
  
  Yeah?  Would they be OK with redefining the provider field as name,
  or would the length limit be an issue?
 
 Nah, it's fine.  The provider name used there is bdr.

Agreed, the provider field should be fine as a name field.  Not that I
expect it to be an issue, but I'd definitely like to keep the label
field as text as those can definitely be longer (the very simply example
included in the security label docs is over half the length of a name
field already..).  Now if we increased name to 128 characters...

/me runs and hides.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] collations in shared catalogs?

2015-02-25 Thread David Steele
On 2/25/15 5:47 PM, Stephen Frost wrote:
 * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Andrew Gierth and...@tao11.riddles.org.uk writes:
 How did that happen? And how could it possibly work?

 It probably doesn't, and the reason nobody has noticed is that the
 security label stuff has fewer users than I have fingers (and those
 people aren't using provider names that would cause anything interesting
 to happen).

 The BDR code has recently started using security labels as a place to
 store table-specific data.  That widens its use a fair bit ... and most
 likely, other extensions will also start using them as soon as they
 realize that it can be used for stuff other than actual security labels.

 Yeah?  Would they be OK with redefining the provider field as name,
 or would the length limit be an issue?

 Nah, it's fine.  The provider name used there is bdr.
 
 Agreed, the provider field should be fine as a name field.  Not that I
 expect it to be an issue, but I'd definitely like to keep the label
 field as text as those can definitely be longer (the very simply example
 included in the security label docs is over half the length of a name
 field already..).  Now if we increased name to 128 characters...

+1 on 128/256 character names.

 /me runs and hides.

/stands brazenly in the open and volunteers to try it if I don't get
clobbered within seconds.

-- 
- David Steele
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature