Re: [HACKERS] bigint vs txid user confusion

2016-12-21 Thread Craig Ringer
On 21 December 2016 at 14:06, Jim Nasby  wrote:
> On 12/20/16 10:20 PM, Craig Ringer wrote:
>>
>> Tools look at pg_class.relfrozenxid and pg_databse.datfrozenxid more
>> than probably anything else, so making changes that ignores them is
>> pretty pointless.
>
>
> Except the only useful way I know of to access *frozenxid is using age(),
> and even that is a royal PITA when the xid is a special xid. So I'd argue
> that we should effectively remove xid from user's view. Even if we don't
> want to bloat pg_class by 4 bytes, we should just make xid even more opaque
> than it is today and tell users to just cast it to bigxid.

That's a good point. Keep it as 'xid' to avoid bloating pg_class.
age(...) continues to make sense there. Change everything else to
bigxid.

If we decide we need comparisons of xid >|<|= bigxid, provide a cast
of xid to bigxid that assumes the current epoch maybe. I'm not super
fond of that though, since users can still write

   XID '1234'

Unsure about how to handle that.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bigint vs txid user confusion

2016-12-20 Thread Jim Nasby

On 12/20/16 10:20 PM, Craig Ringer wrote:

Tools look at pg_class.relfrozenxid and pg_databse.datfrozenxid more
than probably anything else, so making changes that ignores them is
pretty pointless.


Except the only useful way I know of to access *frozenxid is using 
age(), and even that is a royal PITA when the xid is a special xid. So 
I'd argue that we should effectively remove xid from user's view. Even 
if we don't want to bloat pg_class by 4 bytes, we should just make xid 
even more opaque than it is today and tell users to just cast it to bigxid.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bigint vs txid user confusion

2016-12-20 Thread Craig Ringer
On 17 December 2016 at 00:13, Robert Haas  wrote:
> On Thu, Dec 15, 2016 at 3:02 AM, Craig Ringer  wrote:
>> I really wish we could just change the pg_stat_activity and
>> pg_stat_replication xid fields to be epoch qualified in a 64-bit wide
>> 'fullxid' type, or similar.
>
> I think that approach is worth considering.

I'm worried about how many monitoring tools it'd break.

Arguably most or all will already be broken and not know it, though.
Those that aren't will be using age(xid), which we can support just
fine for the new type too, so they won't notice.

Ideally I'd just like to widen 'xid' to 64-bits. This would upset
clients that use binary mode and "know" it's a 32-bit type on the
wire, though, so I'm not sure it's a good idea even though it'd be
nicer in pretty much every other way.

So the idea then is to add a new bigxid type, a 64-bit wide
epoch-extended xid, and replace _all_ appearances of 'xid' with it in
catalogs, views, etc, such that 'xid' is entirely deprecated.

Rather than convert 64-bit extended XIDs to 32-bit for internal
comparisons/functions/operators we'll just epoch-extend our 32-bit
xids, getting rid of the need to handle any sort of "too old" concept
in most cases.

The return type of txid_current() should probably change to the new
bitxid type. This'll upset apps that expect to do maths on it since
the new bigxid type won't have many operators, but since most (all?)
of that maths will be wrong I don't think that's a bad thing. Banging
in a ::bigint will quickfix so adaptation is easy, and it'll highlight
incorrect uses (many) of the call. The type is on-wire compatible with
the current bigint return type until you hit epoch 2^31 in which case
you have bigger things to worry about, like pending epoch wraparound.

HOWEVER, if we're going to really remove 'xid' from user view, it
should vanish from pg_database and pg_class too. That's a LOT more
intrusive, and widens pg_class by 4 bytes per row for minimal gain. No
entry there can ever have an epoch older than the current epoch - 1,
and only then the part that's after the wraparound threshold. pg_class
is a raw relation so we can't transform what the user sees via a view
or function.

Tools look at pg_class.relfrozenxid and pg_databse.datfrozenxid more
than probably anything else, so making changes that ignores them is
pretty pointless.

Everything else looks easy and minimally intrusive, but I'm not sure
there's a sensible answer to pg_class.relfrozenxid.



test=> select table_name, column_name from information_schema.columns
where data_type = 'xid';
  table_name  |  column_name
--+---
 pg_class | relfrozenxid
 pg_class | relminmxid
 pg_database  | datfrozenxid
 pg_database  | datminmxid
 pg_locks | transactionid
 pg_prepared_xacts| transaction
 pg_stat_activity | backend_xid
 pg_stat_activity | backend_xmin
 pg_stat_replication  | backend_xmin
 pg_replication_slots | xmin
 pg_replication_slots | catalog_xmin
(11 rows)

test=> SELECT proname FROM pg_proc WHERE prorettype = 'xid'::regtype
OR 'xid'::regtype = ANY (proargtypes);
 proname
--
 xidin
 xidout
 xideq
 age
 mxid_age
 xideqint4
 pg_get_multixact_members
 pg_xact_commit_timestamp
 xidrecv
 xidsend
(10 rows)

test=> SELECT oprname FROM pg_operator WHERE 'xid'::regtype IN
(oprleft, oprright, oprresult);
 oprname
-
 =
 =
(2 rows)





-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bigint vs txid user confusion

2016-12-16 Thread Robert Haas
On Thu, Dec 15, 2016 at 3:02 AM, Craig Ringer  wrote:
> I really wish we could just change the pg_stat_activity and
> pg_stat_replication xid fields to be epoch qualified in a 64-bit wide
> 'fullxid' type, or similar.

I think that approach is worth considering.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bigint vs txid user confusion

2016-12-15 Thread Craig Ringer
Hi all

I recently had a real world case of a user confused by the
(non)relationship between txid_current()'s output and that of the xid
type, specifically pg_stat_replication.backend_xmin .

I quote:

"
> What should we look for to determine normal? I thought maybe it would
> compare to txid_current(), but these numbers are not at all similar:
>
> XXX=> select txid_current();
>  txid_current
> --
>6311252596
> (1 row)
>
> XXX=> select client_addr, backend_xmin from pg_stat_replication;
>  client_addr  | backend_xmin
> --+--
>  192.168.X.Y |
>  192.168.X.Y  |   2016096136
>  192.168.X.Y  |
>  192.168.X.Y |   2016096136
> (4 rows)


This is a confusing user interface issue in PostgreSQL.

backend_xmin is of type 'xid'. txid_current(), though, returns a
bigint where the high bits are an epoch incremented once per xid
wrap-around, and the low bits are the 32-bit xid.

That's why this output is consistent with the user's two servers having
hot_standby_feedback, but the shown backend_xmin is 4295156460 XIDs
behind the master. That's greater than MAXUINT32 (4294967296)
difference, which seems impossible with a 32-bit transaction ID. It's
because your xid counter has wrapped around once, and
pg_stat_replication doesn't show that, but txid_current() does.

Rather than comparing against txid_current(), the simplest way to get
an indication of how far "behind" the master those XIDs are is to use
the age() function, e.g.

select client_addr, backend_xmin, age(backend_xmin) from
pg_stat_replication;

which will report the difference from the master's xid counter, taking
into account wrap-around etc.

Doing the comparison manually is a bit tricky in SQL.

PostgreSQL really should expose a function to strip the epoch and get
a txid (if the epoch is recent) or null (if the epoch is far in the
past) to make this easier. I submitted one as a part of the
txid_status() patch set and I'll get back to that soon. I just thought
this was relevant.

I really wish we could just change the pg_stat_activity and
pg_stat_replication xid fields to be epoch qualified in a 64-bit wide
'fullxid' type, or similar.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers