[GENERAL] 9.5 "chained equality" behavior

2017-05-30 Thread Joshua Ma
Our team is upgrading from 9.4 to 9.5, and we noticed this behavior change:

9.4:
# SELECT true = true = true;
 ?column?
----------
 t
(1 row)

9.5:
# SELECT true = true = true;
ERROR:  syntax error at or near "="
LINE 1: SELECT true = true = true;

Now, there's actually a larger problem here, since this isn't really
chained equality - it only looks like it. It appears 9.4 was evaluating
right-to-left. We're going to fix these usages to do (a = b AND a = c)
instead of (a = b = c).
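For what it's worth, the pitfall is easy to reproduce with SQLite via
Python's stdlib (SQLite groups left-to-right, unlike what 9.4 appeared to
do, but either way `a = b = c` ends up comparing a boolean against one of
the operands rather than chaining):

```python
# Demonstrating why "a = b = c" is not chained equality: one "=" is
# evaluated first, and its boolean result is compared against the third
# value. SQLite is used here only because it ships with Python.
import sqlite3

conn = sqlite3.connect(":memory:")
chained = conn.execute("SELECT 2 = 2 = 2").fetchone()[0]         # (2 = 2) = 2 -> 1 = 2 -> 0
explicit = conn.execute("SELECT 2 = 2 AND 2 = 2").fetchone()[0]  # 1 (true)
print(chained, explicit)  # 0 1
```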

However, I wanted to email in because I couldn't see anything in the 9.5
changelog (https://www.postgresql.org/docs/9.6/static/release-9-5.html)
that would cause this to become a syntax error. I'm worried that there are
other incompatibilities that we didn't notice.

Can anyone shed some light?

-- 
Joshua Ma
Benchling | benchling.com


Re: [GENERAL] Transaction lock granting order

2016-12-05 Thread Joshua Ma
Thanks a bunch Tom, appreciate the quick response.

On Mon, Dec 5, 2016 at 12:33 PM, Tom Lane  wrote:

> Joshua Ma  writes:
> > Can someone point me to documentation on (or confirm) this detail on
> > Postgres locking?
>
> > - Transaction X starts and acquires a lock on a table T
> > - Transaction Y starts and attempts to acquire a conflicting lock on T -
> > it is now blocked
> > - Transaction Z starts and also attempts to acquire a conflicting lock
> > on T - it is now blocked
>
> > Is txn Y guaranteed to be the first txn to proceed once X finishes?
>
> In isolation, arrival order is respected, but there are cases where it
> would not be.  In particular, lock queues can get reordered to fix
> "soft deadlock" situations where the only alternative to letting Z go
> ahead of Y is to raise a deadlock error.  This would require there being
> other locks in the system besides the ones you mention, of course.
> (And it may well require more than three transactions --- I don't remember
> at the moment what are the user-visible cases where this happens.)
>
> You can find probably more than you want to know about deadlock handling
> in src/backend/storage/lmgr/README.
>
> regards, tom lane
>


[GENERAL] Transaction lock granting order

2016-12-05 Thread Joshua Ma
Can someone point me to documentation on (or confirm) this detail on
Postgres locking?

- Transaction X starts and acquires a lock on a table T
- Transaction Y starts and attempts to acquire a conflicting lock on T - it
is now blocked
- Transaction Z starts and also attempts to acquire a conflicting lock on T
- it is now blocked

Is txn Y guaranteed to be the first txn to proceed once X finishes? Is
there some "lock queue order" that is respected? Is there any chance Z can
proceed before Y, say if it acquires a less aggressive lock?

Put differently: when txn Y starts, it has to wait. Is this wait time at
most the length of the longest txn older than Y?
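If grants do follow arrival order, the behavior looks like a FIFO ticket
queue. A sketch in plain Python (not PostgreSQL's lmgr, which can also
reorder the queue to resolve soft deadlocks):

```python
# A toy FIFO "lock queue": waiters are granted the lock strictly in
# arrival order. No deadlock detection, so this is purely
# first-come, first-served.
import threading
from collections import deque

class FifoLock:
    def __init__(self):
        self._mutex = threading.Lock()   # protects _held and _waiters
        self._waiters = deque()          # one ticket per waiter, oldest first
        self._held = False

    def acquire(self):
        with self._mutex:
            if not self._held:
                self._held = True        # lock was free: take it immediately
                return
            ticket = threading.Event()
            self._waiters.append(ticket)
        ticket.wait()                    # block until release() hands us the lock

    def release(self):
        with self._mutex:
            if self._waiters:
                self._waiters.popleft().set()  # grant to the oldest waiter
            else:
                self._held = False
```

With X holding the lock and Y queued before Z, Y is always granted first
once X releases.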

Thanks,
Josh


[GENERAL] pg_restore casts check constraints differently

2016-03-29 Thread Joshua Ma
This might not be a common case, but we're using pg_dump in a testing
environment to check migrations: 1) we initialize the db from HEAD and
pg_dump it, 2) we initialize the db from migration_base.sql, apply
migrations, and pg_dump it, and 3) we compare the two dumps to verify that
our migrations are correct wrt schema.

However, we're seeing pg_restore transforming our check constraints with
different casting.

# \d arrayed_library
CONSTRAINT arrayed_library_step_check CHECK (((step)::text = ANY
((ARRAY['ADD_RESERVED_SEQUENCES'::character varying,
'ANALYZE_DESIGN_WARNINGS'::character varying,
'COMPLETE_ORDER'::character varying,
'DEFINE_VARIANTS'::character varying,
'LABEL_TRANSLATION'::character varying])::text[])))

$ dropdb db && createdb db
$ pg_dump db --schema-only --no-owner > migration_base.sql
# migration_base.sql has the same CONSTRAINT as above
$ psql db -q -f migration_base.sql

# \d arrayed_library
CONSTRAINT arrayed_library_step_check CHECK (((step)::text = ANY
(ARRAY[('ADD_RESERVED_SEQUENCES'::character varying)::text,
('ANALYZE_DESIGN_WARNINGS'::character varying)::text,
('COMPLETE_ORDER'::character varying)::text,
('DEFINE_VARIANTS'::character varying)::text,
('LABEL_TRANSLATION'::character varying)::text])))

Note that the restored constraint casts each element individually,
ARRAY[('a'::character varying)::text, ...], while the original cast the
whole array at once, (ARRAY['a'::character varying, ...])::text[].

Is there any way to have postgres NOT do the extra conversions?
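For the dump-diffing use case, one mitigation is to compare the labels
inside the ARRAY[...] rather than the deparsed text. A sketch (a regex
over the two forms shown above, not a general SQL parser - the strings
here are abbreviated from the constraints in this email):

```python
# When diffing schema dumps, compare the *labels* in a
# CHECK ... = ANY (ARRAY[...]) constraint rather than the deparsed text,
# since dump/restore can move the ::text casts around without changing
# the constraint's meaning.
import re

def array_labels(constraint: str) -> list[str]:
    """Extract the quoted labels from the first ARRAY[...] expression."""
    body = re.search(r"ARRAY\[(.*?)\]", constraint, re.S).group(1)
    return re.findall(r"'([^']*)'", body)

original = ("CHECK (((step)::text = ANY ((ARRAY["
            "'COMPLETE_ORDER'::character varying, "
            "'DEFINE_VARIANTS'::character varying])::text[])))")
restored = ("CHECK (((step)::text = ANY (ARRAY["
            "('COMPLETE_ORDER'::character varying)::text, "
            "('DEFINE_VARIANTS'::character varying)::text])))")

# Both deparsed forms carry the same label list.
print(array_labels(original) == array_labels(restored))  # True
```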

-- 
- Josh


[GENERAL] Recommendations for migrating PG 9.3 RDS across regions

2015-10-17 Thread Joshua Ma
I'm currently looking into how we might migrate an AWS-hosted database (via
RDS) from us-west-2 to us-east-1. This isn't natively supported by RDS
right now, so I was wondering if anyone has had success with 3rd-party
tools. (I'm also on 9.3, unfortunately, so I lack some of 9.4's replication
features.)

I can always do a pg_dump and pg_restore, but ideally we'd use some sort
of streaming replication to minimize downtime: set up replication, turn
off writes to the primary, promote the replica, flip DNS, and resume with
the newly promoted replica as primary. The DB we're migrating is ~100GB in
size - not too huge, but large enough for pg_dump to take a while.
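For reference, the dump-and-restore fallback might look like this
(hostnames, user, and database names are placeholders):

```shell
# Dump from the us-west-2 instance in custom format, then restore into
# a fresh us-east-1 instance; writes must be stopped for the duration.
pg_dump -h west.example.rds.amazonaws.com -U admin -Fc mydb > mydb.dump
pg_restore -h east.example.rds.amazonaws.com -U admin -d mydb \
    --no-owner -j 4 mydb.dump
```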

For starters, will I run into issues with Londiste, Slony, or Bucardo,
since RDS has more restricted permissions?

Is one of Londiste, Slony, or Bucardo obviously better than the others for
this task? At first glance Bucardo seems the most straightforward, but that
may just be due to the docs being easier to glance through. (I've also
never used any postgres replication outside of the native replication, so
it's possible that some of these tools aren't applicable...)

Any tips/links would be much appreciated.

Best,
Josh


Re: [GENERAL] Why does CREATE INDEX CONCURRENTLY need two scans?

2015-03-31 Thread Joshua Ma
Ah, that's exactly what I was looking for. Thanks everyone for the
responses!

- Josh

On Tue, Mar 31, 2015 at 8:54 PM, Tom Lane  wrote:

> Michael Paquier  writes:
> > On Wed, Apr 1, 2015 at 9:43 AM, Joshua Ma  wrote:
> >> Why are two scans necessary? What would break if it did something like
> >> the following?
> >>
> >> 1) insert pg_index entry, wait for relevant txns to finish, mark index
> >> open for inserts
> >>
> >> 2) build index in a single snapshot, mark index valid for searches
>
> >> Wouldn't new inserts update the index correctly? Between the snapshot
> >> and index-updating txns afterwards, wouldn't all updates be covered?
>
> > When an index is built with index_build, only the tuples seen at the
> > start of the first scan are included in the index. A second scan is
> > needed to add in the index entries for the tuples that have been
> > inserted into the table during the build phase.
>
> More to the point: Joshua's design supposes that retail insertions into
> an index can happen in parallel with index build.  Or in other words,
> that index build consists of instantaneously creating an empty-but-valid
> index file and then doing a lot of ordinary inserts into it.  That's a
> possible design, but it's not very efficient, and most of our index AMs
> don't do it that way.  btree, for instance, starts by sorting all the
> entries and creating the leaf-level pages.  Then it builds the upper tree
> levels.  It doesn't have a complete tree that could support retail
> insertions until the very end.  Moreover, most of the work is done in
> storage that's local to the backend running CREATE INDEX, and isn't
> accessible to other processes at all.
>
> regards, tom lane
>


Re: [GENERAL] Why does CREATE INDEX CONCURRENTLY need two scans?

2015-03-31 Thread Joshua Ma
Hi Michael,

Isn't that also true during the 2nd scan? I'm assuming new inserts during
the 2nd scan properly update the index, so couldn't the same mechanism
update the index during the 1st scan?

I guess I'm confused because, if you assume a pathological case where all
the data gets inserted after the 1st snapshot, the 1st scan wouldn't pick
anything up and the 3-step process I had earlier becomes identical to the
2-step process. Is there something special about index_build?

- Josh

On Tue, Mar 31, 2015 at 7:08 PM, Michael Paquier 
wrote:

>
>
> On Wed, Apr 1, 2015 at 9:43 AM, Joshua Ma  wrote:
>
>> Hi all,
>>
>> I was curious about why CONCURRENTLY needs two scans to complete - from
>> the documentation on HOT (access/heap/README.HOT), it looks like the
>> process is:
>>
>> 1) insert pg_index entry, wait for relevant in-progress txns to finish
>> (before marking index open for inserts, so HOT updates won't write
>> incorrect index entries)
>> 2) build index in 1st snapshot, mark index open for inserts
>> 3) in 2nd snapshot, validate index and insert missing tuples since first
>> snapshot, mark index valid for searches
>>
>> Why are two scans necessary? What would break if it did something like
>> the following?
>>
>> 1) insert pg_index entry, wait for relevant txns to finish, mark index
>> open for inserts
>>
>> 2) build index in a single snapshot, mark index valid for searches
>>
>> Wouldn't new inserts update the index correctly? Between the snapshot and
>> index-updating txns afterwards, wouldn't all updates be covered?
>>
>
> When an index is built with index_build, only the tuples seen at the
> start of the first scan are included in the index. A second scan is
> needed to add in the index entries for the tuples that have been
> inserted into the table during the build phase.
> --
> Michael
>


[GENERAL] Why does CREATE INDEX CONCURRENTLY need two scans?

2015-03-31 Thread Joshua Ma
Hi all,

I was curious about why CONCURRENTLY needs two scans to complete - from the
documentation on HOT (access/heap/README.HOT), it looks like the process is:

1) insert pg_index entry, wait for relevant in-progress txns to finish
(before marking index open for inserts, so HOT updates won't write
incorrect index entries)
2) build index in 1st snapshot, mark index open for inserts
3) in 2nd snapshot, validate index and insert missing tuples since first
snapshot, mark index valid for searches

Why are two scans necessary? What would break if it did something like the
following?

1) insert pg_index entry, wait for relevant txns to finish, mark index open
for inserts
2) build index in a single snapshot, mark index valid for searches

Wouldn't new inserts update the index correctly? Between the snapshot and
index-updating txns afterwards, wouldn't all updates be covered?

To be clear, I'm not trying to suggest any changes, just wondering what's
missing from my mental model. :)
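To make the question concrete, here's a toy simulation (plain Python
lists standing in for the heap and index, nothing to do with the actual
backend code) of the gap the second scan closes - rows inserted after the
build snapshot are invisible to the bulk build:

```python
# Toy model: the bulk build sorts a snapshot of the heap, so a row
# inserted during the build is missing until a second pass adds it.
heap = [3, 1, 2]                 # rows visible when the build snapshot is taken
snapshot = list(heap)

index = sorted(snapshot)         # scan 1: bulk build from the snapshot
heap.append(0)                   # concurrent insert during the build
assert 0 not in index            # a one-scan design would stop here, missing 0

missing = [row for row in heap if row not in index]
index = sorted(index + missing)  # scan 2: validate, add the missing tuples
print(index)                     # [0, 1, 2, 3]
```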

Thanks!
Josh