Re: PostgreSQL as advanced job queuing system

2024-03-22 Thread Shaheed Haque
Generally, I'd suggest you think carefully about the nature of the jobs, and draw up a list of must-have properties (performance of course, but also things like whether jobs have to survive planned or unplanned outages, be visible across a WAN, numbers of readers and writers, delivery guarantees,
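
For context, the pattern most often reached for when building a queue on plain Postgres is FOR UPDATE SKIP LOCKED; a minimal sketch follows (the table and column names are illustrative, not from the thread):

  -- Hypothetical queue table.
  CREATE TABLE jobs (
      id         bigserial PRIMARY KEY,
      payload    jsonb NOT NULL,
      status     text NOT NULL DEFAULT 'pending',
      created_at timestamptz NOT NULL DEFAULT now()
  );

  -- A worker claims one job; SKIP LOCKED keeps concurrent workers from blocking each other.
  WITH next_job AS (
      SELECT id FROM jobs
      WHERE status = 'pending'
      ORDER BY created_at
      LIMIT 1
      FOR UPDATE SKIP LOCKED
  )
  UPDATE jobs SET status = 'running'
  FROM next_job
  WHERE jobs.id = next_job.id
  RETURNING jobs.id, jobs.payload;

Whether something this simple is enough depends on exactly the must-have properties listed above (surviving outages, WAN visibility, delivery guarantees, and so on).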

Re: creating a subset DB efficiently ?

2024-03-09 Thread Shaheed Haque
I go more or less the other way. I have a script which follows the child...parent...grandparent...etc foreign keys in the source database and dumps only the records belonging to the selected "project" (your terminology, in my case it is "client"). I save the dumped data to an archive file. The
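
The same idea sketched by hand for a two-level foreign-key chain (all names here are hypothetical; the script described above walks the key graph automatically rather than hard-coding it):

  -- clients <- projects <- tasks, linked by foreign keys; dump one client's rows only.
  \copy (SELECT * FROM clients WHERE id = 42) TO 'clients.csv' CSV
  \copy (SELECT * FROM projects WHERE client_id = 42) TO 'projects.csv' CSV
  \copy (SELECT t.* FROM tasks t JOIN projects p ON p.id = t.project_id WHERE p.client_id = 42) TO 'tasks.csv' CSV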

Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Shaheed Haque
Might it be worth a modest amount of time using some basic profiling to see where the time is going? A week is a looonnngg time, even for 150e6 operations. For example, if there is an unexpectedly high IO load, some temporary M.2 storage might help? On Tue, 6 Feb 2024, 01:36 Ron Johnson, wrote: >
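
As a first database-side step for the profiling suggested above, it may be worth sizing the large-object store itself (standard catalogs, so this should apply as-is):

  -- How many large objects are there, and how big is the underlying pg_largeobject table?
  SELECT count(*) AS lob_count FROM pg_largeobject_metadata;
  SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject')) AS lob_storage;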

Re: Scriptable way to validate a pg_dump restore ?

2024-01-30 Thread Shaheed Haque
On Tue, 30 Jan 2024, 05:02 Adrian Klaver, wrote: > On 1/29/24 11:35, Shaheed Haque wrote: > > > > > > On Tue, 30 Jan 2024, 00:27 Adrian Klaver, wrote: > > > > On 1/29/24 10:12, Shaheed Haque wrote: >

Re: Monitoring logical replication

2024-01-30 Thread Shaheed Haque
t; logger -t $PREFIX "OK: Subscription '$PUB' contains > $NUMTAB tables - that is OK. Checking for initial-sync status ..." > break > fi > echo "ERROR: Subscription '$PUB' contains $COUNT tables, but > should contain $NUMTAB table. Re-Chec

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
On Tue, 30 Jan 2024, 00:27 Adrian Klaver, wrote: > On 1/29/24 10:12, Shaheed Haque wrote: > > > > > > Yes. But I was under the impression that the initial copy of logical > > replication was the same? > > > > Are you talking about the copy_data option

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
On Mon, 29 Jan 2024, 23:57 Adrian Klaver, wrote: > On 1/29/24 09:28, Shaheed Haque wrote: > > > > > > Right, for me, state, not just record count is what I'm interested in > > (for the initial full table copy part of replication). So, given the > > explanation ab

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
On Mon, 29 Jan 2024, 22:52 Adrian Klaver, wrote: > On 1/29/24 00:12, Laura Smith wrote: > > Hi > > > > Let's say I've got a scenario where I'm doing a pg_dump replication > rather than online streaming, e.g. due to air-gap or whatever. > > > > Is there a scriptable way to validate the restore ?

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
I'd also like to know how to do this. The current approaches seem, afaict, to involve making comparisons on both ends of the connection. Even given the inherently racy nature of the issue, that seems unwieldy to me.
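
One hedged sketch of doing exactly that on both ends and diffing the output (the per-table checksum assumes a stable ordering key such as an id column; names are illustrative):

  -- Approximate sanity check: per-table row-count estimates on source and restore.
  SELECT schemaname, relname, n_live_tup
  FROM pg_stat_user_tables
  ORDER BY schemaname, relname;

  -- Stronger check for a single table: a content checksum to compare across the air gap.
  SELECT md5(string_agg(t::text, '|' ORDER BY t.id)) AS table_checksum
  FROM some_table AS t;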

Monitoring logical replication

2023-10-07 Thread Shaheed Haque
Hi, I've been playing with logical replication (currently on PG14), specifically in an AWS RDS Postgres context, but NOT using AWS' own replication tooling. I'm generally familiar with the challenges of distributed systems (such as causality, time synchronisation, etc.), but not especially familiar
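
For reference, the built-in views give a reasonable starting point on PG14 without any AWS-specific tooling; a sketch, not the monitoring the thread eventually settles on:

  -- On the subscriber: per-table sync state
  -- ('i' = initialising, 'd' = copying data, 's' = synchronised, 'r' = ready).
  SELECT srrelid::regclass AS table_name, srsubstate
  FROM pg_subscription_rel;

  -- On the publisher: how far behind each consumer's replay position is, in bytes.
  SELECT application_name, state,
         pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
  FROM pg_stat_replication;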

Re: Is it possible to index "deep" into a JSONB column?

2022-05-31 Thread Shaheed Haque
I'm unable to make "CREATE INDEX...USING gin ((snapshot -> '$.employee.* ? (@.pay_graph <> 0 || @.last_run_of_employment == true || @.state.employment[last][2] == 0)'))" trigger that. On Tue, 31 May 2022 at 10:16, Shaheed Haque wrote: > > OK, I was ab

Re: Is it possible to index "deep" into a JSONB column?

2022-05-31 Thread Shaheed Haque
USING gin ((snapshot -> '$.employee')); CREATE INDEX bar ON paiyroll_payrun USING gin ((snapshot -> '$.employee.*.pay_graph')); Any thoughts on indexing for this case? If it makes any difference, I'm presently on PG12 and PG13 but looking to move to PG14. Thanks, Shaheed On Mon, 30 Ma

Re: Is it possible to index "deep" into a JSONB column?

2022-05-30 Thread Shaheed Haque
"2000-02-02"}}}') >> ; >> >> select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) >> from jsonb_table; >> >> -- create index >> create index jpqarr_idx >> on jsonb_table >> using gin ( jsonb_path_query_array( jsonb_col, >> '$.emp

Re: Is it possible to index "deep" into a JSONB column?

2022-05-30 Thread Shaheed Haque
Hi Bryn, On Mon, 30 May 2022 at 03:12, Bryn Llewellyn wrote: ... > > Try this: > >snapshot -> 'employee'->>'date_of_birth' > Syntactically, that works: create index bryn on paiyroll_payrun using btree ((snapshot -> 'employee'->>'date_of_birth')); But IIUC it is looking for

Is it possible to index "deep" into a JSONB column?

2022-05-29 Thread Shaheed Haque
Suppose I have a JSONB field called "snapshot". I can create a GIN index on it like this: create index idx1 on mytable using gin (snapshot); In principle, I believe this allows index-assisted access to keys and values nested in arrays and inner objects but in practice, it seems the planner
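
One pattern the planner can match is to index the specific sub-document that queries actually probe, and then write the predicate against that same expression; a sketch using the table and column names from the question, with the key value purely illustrative:

  -- Index only the fragment that is queried, rather than the whole document.
  CREATE INDEX idx_snapshot_employee ON mytable USING gin ((snapshot -> 'employee'));

  -- A key-existence test written against the same expression can use that index.
  SELECT * FROM mytable WHERE (snapshot -> 'employee') ? '16376';

The crucial part is that the WHERE clause uses exactly the indexed expression; a predicate written against snapshot as a whole will not match the expression index.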

Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-29 Thread Shaheed Haque
On Sun, 29 May 2022, 15:58 Tom Lane, wrote: > Shaheed Haque writes: > > Unfortunately, the real query which I think should behave very > > similarly is still at the several-seconds level despite using the > > index. ... > > > -> Bitmap Heap Scan on paiyroll

Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
2 loops=1) Index Cond: ((snapshot -> 'employee'::text) ? '16376'::text) Planning Time: 0.245 ms Execution Time: 2259.019 ms === IIUC, at the bottom, the indices are doing their thing, but a couple of layers up, the "Bitmap Heap Scan" jumps from ~30ms to

Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
suggested and will report back with anything of note. One last thought about TOAST. If the cost of the -> operator retrieving the data cannot be obviated, is there any way to tweak how that works? Thanks, Shaheed On Sat, 28 May 2022 at 19:41, Tom Lane wrote: > > Shaheed Haqu

JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
Hi, I have a database table with a modest number of rows (<1000) but where one column in the table is a JSONB "snapshot" which can be a few MB in size. Generally, this is a great fit for the read-write access patterns involved, but there is a read-query which is VERY slow. I've searched via
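
Two things often worth checking in this situation, sketched against the table and column named later in the thread (the id ordering and the generated column are assumptions, not what the thread concludes):

  -- How big is the stored (TOASTed) document per row?
  SELECT id, pg_column_size(snapshot) AS stored_bytes
  FROM paiyroll_payrun
  ORDER BY stored_bytes DESC
  LIMIT 10;

  -- One way to avoid detoasting a multi-MB document for every read of a small
  -- fragment: materialise that fragment (PG12+ stored generated column).
  ALTER TABLE paiyroll_payrun
      ADD COLUMN employee_summary jsonb
      GENERATED ALWAYS AS (snapshot -> 'employee') STORED;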

Re: Using JSONB with nested key-value structures, and skipping/wildcarding the key

2020-06-02 Thread Shaheed Haque
> > Hi, > > On Mon, 1 Jun 2020 at 23:50, Alban Hertroys wrote: > > On 1 Jun 2020, at 20:18, Shaheed Haque wrote: > > > > Hi, > > > > I'm using Django's ORM to access Postgres12. My "MyModel" table has a > JSONB column called 'snapsho

Using JSONB with nested key-value structures, and skipping/wildcarding the key

2020-06-01 Thread Shaheed Haque
Hi, I'm using Django's ORM to access Postgres12. My "MyModel" table has a JSONB column called 'snapshot'. In Python terms, each row's 'snapshot' looks like this: == snapshot = { 'pay_definition' : { '1234': {..., 'name': 'foo', ...}, '99': {..., 'name':
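
For the wildcarded key itself, PG12's jsonpath support covers this directly; a sketch against the model described above (the table name and id column are assumed from the Django model, not stated in the post):

  -- Pull 'name' out of every pay_definition, whatever its numeric key.
  SELECT id, jsonb_path_query_array(snapshot, '$.pay_definition.*.name') AS names
  FROM mymodel;

  -- Alternatively, expand the unknown keys and filter in plain SQL.
  SELECT m.id, d.key, d.value ->> 'name' AS name
  FROM mymodel m, jsonb_each(m.snapshot -> 'pay_definition') AS d(key, value);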