Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread David G. Johnston
On Mon, Jul 24, 2017 at 8:11 PM, Tom Lane  wrote:

> [docs]
>  * If the data were perfectly distributed, with the same
>  * number of tuples going into each available bucket, then the bucketsize
>  * fraction would be 1/nbuckets.  But this happy state of affairs will occur
>  * only if (a) there are at least nbuckets distinct data values, and (b)
>  * we have a not-too-skewed data distribution.  Otherwise the buckets will
>  * be nonuniformly occupied.


Thanks, I have a better feel now.  Using this example (200 inner-relation
rows) is pretty poor, since at this scale there doesn't seem to be enough
data to make a noticeable difference.

But anyway, the above comment is only applied when dealing with a
non-unique inner relation; for a unique relation, the fraction used is
1/nbuckets regardless of its size.

/* from final_cost_hashjoin() in costsize.c */
if (IsA(inner_path, UniquePath))
    innerbucketsize = 1.0 / virtualbuckets;
else
    /* ... estimate_hash_bucketsize() is consulted per hash clause ... */

And to clarify for others only reading this... the 200 on the "VALUES" node
is there because there are 200 literal values in the value_list.  The 200
on the resulting Hash (and HashAggregate in the example) node is there
because of DEFAULT_NUM_DISTINCT (changing the query limit to 300 only
changed the former).  Further, since it is only the default, the fraction
charged out is 1/10 instead of the 1/200 that would be used if the 200 were
a real number - or 1/1024 if those 200 rows were known to be themselves
unique.
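
A hypothetical sketch of that coincidence (consistent with the plans later
in this thread: the Values Scan estimate tracks the literal count, while
the HashAggregate above it stays at 200):

EXPLAIN SELECT count(*) FROM ids
WHERE id IN (VALUES (1), (2), (3));   -- imagine 300 literals here
-- Values Scan on "*VALUES*":  rows = number of literals (300)
-- HashAggregate above it:     rows = 200 (DEFAULT_NUM_DISTINCT, since a
--                             bare VALUES list has no statistics)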

For me, I'm seeing that the expected number of input rows doesn't factor
into the innerbucketsize computation directly (possibly excepting a scaling
factor adjustment).

I can understand better, now, why this seemingly perfect example of a
semi-join query gets executed with an extra distinct/grouping node.

David J.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Jul 24, 2017 at 3:46 PM, Tom Lane  wrote:
>> The cost to form the inner hash is basically negligible whether it's
>> de-duped or not, but if it's not (known) de-duped then the cost
>> estimate for the semijoin is going to rise some, and that discourages
>> selecting it.

> Why does the "hash semi join" care about duplication of values on the
> inner relation?  Doesn't it only care whether a given bucket exists
> irrespective of its contents?

No, it cares about the average bucket size, ie number of entries that
a probe will have to look at.  The non-de-duped inner relation can't
be assumed to have a flat distribution among the buckets.

> Pointing me to the readme or code file (comments) that explains this in
> more detail would be welcome.

Look for estimate_hash_bucketsize in selfuncs.c and its caller in
costsize.c.  The key point is this argument in that function's
header comment:

 * We are passed the number of buckets the executor will use for the given
 * input relation.  If the data were perfectly distributed, with the same
 * number of tuples going into each available bucket, then the bucketsize
 * fraction would be 1/nbuckets.  But this happy state of affairs will occur
 * only if (a) there are at least nbuckets distinct data values, and (b)
 * we have a not-too-skewed data distribution.  Otherwise the buckets will
 * be nonuniformly occupied.  If the other relation in the join has a key
 * distribution similar to this one's, then the most-loaded buckets are
 * exactly those that will be probed most often.  Therefore, the "average"
 * bucket size for costing purposes should really be taken as something close
 * to the "worst case" bucket size.  We try to estimate this by adjusting the
 * fraction if there are too few distinct data values, and then scaling up
 * by the ratio of the most common value's frequency to the average frequency.
 *
 * If no statistics are available, use a default estimate of 0.1.  This will
 * discourage use of a hash rather strongly if the inner relation is large,
 * which is what we want.  We do not want to hash unless we know that the
 * inner rel is well-dispersed (or the alternatives seem much worse).

That's certainly a bit pessimistic, but the thing to remember about any
hashing algorithm is that occasionally it will eat your lunch.  We prefer
to go to sort-based algorithms if there seems to be a risk of the hash
degenerating to O(N^2).
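
A hedged sketch of that adjustment in SQL (the real logic lives in C in
estimate_hash_bucketsize; the function and argument names here are purely
illustrative):

CREATE FUNCTION bucketsize_fraction(
    ndistinct float8, nbuckets float8,
    mcv_freq float8, avg_freq float8) RETURNS float8
LANGUAGE sql IMMUTABLE AS $$
    -- start from 1/nbuckets, or 1/ndistinct when there are too few
    -- distinct values to fill every bucket ...
    SELECT (CASE WHEN ndistinct < nbuckets
                 THEN 1.0 / ndistinct
                 ELSE 1.0 / nbuckets END)
    -- ... then scale up by how much the most common value's frequency
    -- exceeds the average frequency
         * (CASE WHEN avg_freq > 0 AND mcv_freq > avg_freq
                 THEN mcv_freq / avg_freq
                 ELSE 1.0 END)
$$;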

regards, tom lane




Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread David G. Johnston
On Mon, Jul 24, 2017 at 7:58 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Mon, Jul 24, 2017 at 3:46 PM, Tom Lane  wrote:
>
>>
>> The cost to form the inner hash is basically negligible whether it's
>> de-duped or not, but if it's not (known) de-duped then the cost
>> estimate for the semijoin is going to rise some, and that discourages
>> selecting it.
>>
>
> Why does the "hash semi join" care about duplication of values on the
> inner relation?  Doesn't it only care whether a given bucket exists
> irrespective of its contents?
>

Rather, it cares about the contents insofar as it must confirm that at
least one of the tuples in the bucket indeed has the same join value as the
outer relation (I lost track of the fact that two values can share the same
hash).  But once it finds one it can move on to the next outer-relation
tuple, while an inner join would have to spend more time looking for
additional matches.
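
A quick sketch of that behavioral difference (hypothetical, reusing the ids
table from this thread; note the duplicated value):

SELECT count(*) FROM ids WHERE id IN (VALUES (1), (1));
-- semi join: each matching outer row counts once

SELECT count(*) FROM ids JOIN (VALUES (1), (1)) v(x) ON v.x = ids.id;
-- plain join: one row per match, so each matching id counts twice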

David J.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread David G. Johnston
On Mon, Jul 24, 2017 at 3:46 PM, Tom Lane  wrote:

>
> The cost to form the inner hash is basically negligible whether it's
> de-duped or not, but if it's not (known) de-duped then the cost
> estimate for the semijoin is going to rise some, and that discourages
> selecting it.
>

Why does the "hash semi join" care about duplication of values on the
inner relation?  Doesn't it only care whether a given bucket exists
irrespective of its contents?

Looking at those explains, it would seem the "hash semi join" is simply
inherently more expensive to execute than a "hash join", and that the act
of de-duping the inner relation would have to be quite expensive to
overcome the gap.  I cannot reconcile this with the previous paragraph,
though...

Pointing me to the readme or code file (comments) that explains this in
more detail would be welcome.  Not sure what to grep for - "Hash Semi Join"
only turns up a couple of expected output results...

Thx.

David J.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Jeff Janes
On Jul 24, 2017 14:19, "PT"  wrote:

On Mon, 24 Jul 2017 13:17:56 +0300
Dmitry Lazurkin  wrote:

> On 07/24/2017 01:40 AM, PT wrote:
> > In this example you count approximately 40,000,000 values, which is
> > about 40% of the table.
>
> 4 000 000 (:
>
> > If you really need these queries to be faster, I would suggest
> > materializing the data, i.e. create a table like:
> >
> > CREATE TABLE id_counts (
> >  id BIGINT PRIMARY KEY,
> >  num BIGINT
> > )
> >
> > Then use a trigger or similar technique to keep id_counts in sync
> > with the id table. You can then run queries of the form:
> >
> > SELECT sum(num) FROM id_counts WHERE id IN :values:
> >
> > which I would wager houseboats will be significantly faster.
> I use count only for example because it uses seqscan. I want optimize
> IN-clause ;-).

The IN clause is not what's taking all the time. It's the processing of
millions of rows that's taking all the time.


It isn't either-or.  It is the processing of millions of rows over the
large in-list which is taking the time.  Processing an in-list as a hash
table would be great, but no one has gotten around to implementing it yet.
Maybe Dmitry will be the one to do that.
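
(Until then, a hedged sketch of the workaround already demonstrated in this
thread is to hand the planner a relation it can hash instead of a bare
literal list:

SELECT count(*) FROM ids
WHERE id IN (VALUES (1), (2), (3));  -- a VALUES relation is hashable

whereas IN (1, 2, 3) is evaluated as a per-row scalar filter.)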

Cheers,

Jeff


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Jul 24, 2017 at 3:22 PM, Dmitry Lazurkin  wrote:
>> ALTER TABLE ids ALTER COLUMN id SET NOT NULL;
>> EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN
>> :values_clause;
>> 
>> Aggregate  (cost=245006.46..245006.47 rows=1 width=8) (actual
>> time=3824.095..3824.095 rows=1 loops=1)
>> Buffers: shared hit=44248
>> ->  Hash Join  (cost=7.50..235006.42 rows=419 width=0) (actual
>> time=1.108..3327.112 rows=3998646 loops=1)
>> ...

> You haven't constrained the outer relation (i.e., :values_clause) to be
> non-null which is what I believe is required for the semi-join algorithm to
> be considered.

No, the planner is thinking about semi-join, it just decides it prefers
to de-dup and then do a plain join.  I believe this is mainly because it
lacks statistics about the inner relation and is conservative about what
it assumes about the number of duplicates in the absence of stats.
But you can force it.  Taking the original example (and being sure to
have ANALYZE'd ids):

regression=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN
:values_clause;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=245006.16..245006.17 rows=1 width=8) (actual time=3550.581..3550.581 rows=1 loops=1)
   Buffers: shared hit=2208 read=42040
   ->  Hash Join  (cost=7.50..235006.13 rows=413 width=0) (actual time=0.494..3093.100 rows=4002875 loops=1)
         Hash Cond: (ids.id = "*VALUES*".column1)
         Buffers: shared hit=2208 read=42040
         ->  Seq Scan on ids  (cost=0.00..144248.33 rows=1033 width=8) (actual time=0.071..1118.278 rows=1000 loops=1)
               Buffers: shared hit=2208 read=42040
         ->  Hash  (cost=5.00..5.00 rows=200 width=4) (actual time=0.404..0.404 rows=200 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 16kB
               ->  HashAggregate  (cost=3.00..5.00 rows=200 width=4) (actual time=0.267..0.332 rows=200 loops=1)
                     Group Key: "*VALUES*".column1
                     ->  Values Scan on "*VALUES*"  (cost=0.00..2.50 rows=200 width=4) (actual time=0.003..0.134 rows=200 loops=1)
 Planning time: 0.561 ms
 Execution time: 3550.700 ms
(14 rows)

regression=# set enable_hashagg TO 0;
SET
regression=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN
:values_clause;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=245012.31..245012.32 rows=1 width=8) (actual time=3553.194..3553.194 rows=1 loops=1)
   Buffers: shared hit=2240 read=42008
   ->  Hash Join  (cost=13.64..235012.28 rows=413 width=0) (actual time=0.545..3093.434 rows=4002875 loops=1)
         Hash Cond: (ids.id = "*VALUES*".column1)
         Buffers: shared hit=2240 read=42008
         ->  Seq Scan on ids  (cost=0.00..144248.33 rows=1033 width=8) (actual time=0.072..1118.853 rows=1000 loops=1)
               Buffers: shared hit=2240 read=42008
         ->  Hash  (cost=11.14..11.14 rows=200 width=4) (actual time=0.452..0.452 rows=200 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 16kB
               ->  Unique  (cost=10.14..11.14 rows=200 width=4) (actual time=0.227..0.384 rows=200 loops=1)
                     ->  Sort  (cost=10.14..10.64 rows=200 width=4) (actual time=0.226..0.276 rows=200 loops=1)
                           Sort Key: "*VALUES*".column1
                           Sort Method: quicksort  Memory: 35kB
                           ->  Values Scan on "*VALUES*"  (cost=0.00..2.50 rows=200 width=4) (actual time=0.003..0.134 rows=200 loops=1)
 Planning time: 0.567 ms
 Execution time: 3553.297 ms
(16 rows)

regression=# set enable_sort TO 0;
SET
regression=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN
:values_clause;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=320003.90..320003.91 rows=1 width=8) (actual time=3548.364..3548.364 rows=1 loops=1)
   Buffers: shared hit=2272 read=41976
   ->  Hash Semi Join  (cost=5.00..310003.87 rows=413 width=0) (actual time=0.331..3091.235 rows=4002875 loops=1)
         Hash Cond: (ids.id = "*VALUES*".column1)
         Buffers: shared hit=2272 read=41976
         ->  Seq Scan on ids  (cost=0.00..144248.33 rows=1033 width=8) (actual time=0.071..1117.761 

Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Dmitry Lazurkin

On 25.07.2017 01:25, David G. Johnston wrote:
On Mon, Jul 24, 2017 at 3:22 PM, Dmitry Lazurkin wrote:


ALTER TABLE ids ALTER COLUMN id SET NOT NULL;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN
:values_clause;

 Aggregate  (cost=245006.46..245006.47 rows=1 width=8) (actual time=3824.095..3824.095 rows=1 loops=1)
   Buffers: shared hit=44248
   ->  Hash Join  (cost=7.50..235006.42 rows=419 width=0) (actual time=1.108..3327.112 rows=3998646 loops=1)
   ...


You haven't constrained the outer relation (i.e., :values_clause) to
be non-null which is what I believe is required for the semi-join
algorithm to be considered.


David J.


CREATE TABLE second_ids (i bigint);
INSERT INTO second_ids :values_clause;

EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN (select i from second_ids);


 Aggregate  (cost=225004.36..225004.37 rows=1 width=8) (actual time=3826.641..3826.641 rows=1 loops=1)
   Buffers: shared hit=44249
   ->  Hash Semi Join  (cost=5.50..215004.32 rows=419 width=0) (actual time=0.352..3338.601 rows=3998646 loops=1)
         Hash Cond: (ids.id = second_ids.i)
         Buffers: shared hit=44249
         ->  Seq Scan on ids  (cost=0.00..144248.48 rows=1048 width=8) (actual time=0.040..1069.006 rows=1000 loops=1)
               Buffers: shared hit=44248
         ->  Hash  (cost=3.00..3.00 rows=200 width=8) (actual time=0.288..0.288 rows=200 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 16kB
               Buffers: shared hit=1
               ->  Seq Scan on second_ids  (cost=0.00..3.00 rows=200 width=8) (actual time=0.024..0.115 rows=200 loops=1)
                     Buffers: shared hit=1
 Planning time: 0.413 ms
 Execution time: 3826.752 ms

A Hash Semi Join, even without a NOT NULL constraint on the second table.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Dmitry Lazurkin

On 25.07.2017 01:15, David G. Johnston wrote:
On Mon, Jul 24, 2017 at 3:12 PM, Dmitry Lazurkin wrote:


And I have one question. I don't understand why IN-VALUES doesn't
use a Semi-Join. PostgreSQL has Hash Semi-Join...  For which tasks
does the database use a node of this type?


Semi-Join is canonically written as:

SELECT *
FROM tbl
WHERE EXISTS (SELECT 1 FROM tbl2 WHERE tbl.id = tbl2.id)


The main difference between IN and EXISTS is NULL semantics.

David J.



ALTER TABLE ids ALTER COLUMN id SET NOT NULL;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN 
:values_clause;


 Aggregate  (cost=245006.46..245006.47 rows=1 width=8) (actual time=3824.095..3824.095 rows=1 loops=1)
   Buffers: shared hit=44248
   ->  Hash Join  (cost=7.50..235006.42 rows=419 width=0) (actual time=1.108..3327.112 rows=3998646 loops=1)
   ...

Hmmm. No Semi-Join.


PostgreSQL can use Semi-Join for IN too.



Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread David G. Johnston
On Mon, Jul 24, 2017 at 3:22 PM, Dmitry Lazurkin  wrote:

> ALTER TABLE ids ALTER COLUMN id SET NOT NULL;
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id IN
> :values_clause;
>
>  Aggregate  (cost=245006.46..245006.47 rows=1 width=8) (actual
> time=3824.095..3824.095 rows=1 loops=1)
>Buffers: shared hit=44248
>->  Hash Join  (cost=7.50..235006.42 rows=419 width=0) (actual
> time=1.108..3327.112 rows=3998646 loops=1)
>...
>

You haven't constrained the outer relation (i.e., :values_clause) to be
non-null which is what I believe is required for the semi-join algorithm to
be considered.

David J.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread David G. Johnston
On Mon, Jul 24, 2017 at 3:12 PM, Dmitry Lazurkin  wrote:

> And I have one question. I don't understand why IN-VALUES doesn't use a
> Semi-Join. PostgreSQL has Hash Semi-Join...  For which tasks does the
> database use a node of this type?
>

Semi-Join is canonically written as:

SELECT *
FROM tbl
WHERE EXISTS (SELECT 1 FROM tbl2 WHERE tbl.id = tbl2.id)

The main difference between IN and EXISTS is NULL semantics.
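
A minimal illustration of those NULL semantics (the difference is most
visible in the negated forms):

SELECT 1 WHERE 1 NOT IN (SELECT NULL::int);
-- no row: 1 <> NULL is unknown, so NOT IN can never be true

SELECT 1 WHERE NOT EXISTS
  (SELECT 1 FROM (SELECT NULL::int AS id) t WHERE t.id = 1);
-- one row: the subquery finds no match, so NOT EXISTS is true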

David J.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Dmitry Lazurkin

On 25.07.2017 00:31, David G. Johnston wrote:


Basically you want to write something like:

SELECT *
FROM ids
JOIN ( :values_clause ) vc (vid) ON (vc.vid = ids.id)

or

WITH vc AS (SELECT vid FROM  ORDER BY ... LIMIT )
SELECT *
FROM ids
JOIN vc ON (vid = ids.id)



This query uses the same join plan as IN (VALUES ...).

And I have one question. I don't understand why IN-VALUES doesn't use a
Semi-Join. PostgreSQL has Hash Semi-Join...  For which tasks does the
database use a node of this type?




Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Dmitry Lazurkin

On 25.07.2017 00:17, PT wrote:

The IN clause is not what's taking all the time. It's the processing of
millions of rows that's taking all the time.


IN (...) - 17 sec
IN (VALUES ...) - 4 sec
So the performance issue is with the IN clause.


Perhaps you should better describe what it is you really want to accomplish.
Regardless of what it is, if it involves processing many millions of rows,
you're probably going to need to do some sort of materialization.


I am trying to find better solutions for the IN task.




Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread David G. Johnston
On Sun, Jul 23, 2017 at 4:35 AM, dilaz03 .  wrote:

> - The IN-VALUES clause adds a new node to the plan. Does the additional
> node have a big overhead? How about filtering by two or more IN-VALUES
> clauses?
>

IN-VALUES is just another word for "TABLE" which is another word for
"RELATION".  Writing relational database queries that use explicit
relations is generally going to give you the best performance.

Basically you want to write something like:

SELECT *
FROM ids
JOIN ( :values_clause ) vc (vid) ON (vc.vid = ids.id)

or

WITH vc AS (SELECT vid FROM  ORDER BY ... LIMIT )
SELECT *
FROM ids
JOIN vc ON (vid = ids.id)
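
For a concrete (hypothetical) instance of the first form, with a literal
VALUES list standing in for :values_clause:

SELECT ids.*
FROM ids
JOIN (VALUES (1), (2), (3)) vc (vid) ON (vc.vid = ids.id);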

"IN ('l1','l2','l3')" is nice and all but as demonstrated the mechanics of
executing that are different, and slower, than processing relations and
tuples.  For a small number of items the difference is generally not
meaningful and so the convenience of writing (IN (...)) is worth taking.

David J.


Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread PT
On Mon, 24 Jul 2017 13:17:56 +0300
Dmitry Lazurkin  wrote:

> On 07/24/2017 01:40 AM, PT wrote:
> > In this example you count approximately 40,000,000 values, which is
> > about 40% of the table. 
>
> 4 000 000 (:
>
> > If you really need these queries to be faster, I would suggest
> > materializing the data, i.e. create a table like:
> >
> > CREATE TABLE id_counts (
> >  id BIGINT PRIMARY KEY,
> >  num BIGINT
> > )
> >
> > Then use a trigger or similar technique to keep id_counts in sync
> > with the id table. You can then run queries of the form:
> >
> > SELECT sum(num) FROM id_counts WHERE id IN :values:
> >
> > which I would wager houseboats will be significantly faster.
> I use count only for example because it uses seqscan. I want optimize
> IN-clause ;-).

The IN clause is not what's taking all the time. It's the processing of
millions of rows that's taking all the time.

Perhaps you should better describe what it is you really want to accomplish.
Regardless of what it is, if it involves processing many millions of rows,
you're probably going to need to do some sort of materialization.
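
For the id_counts idea above, the "trigger or similar technique" could be
sketched like this (assumes PostgreSQL 9.5+ for ON CONFLICT; the names are
hypothetical, and deletes would need a matching decrement):

CREATE FUNCTION bump_id_count() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    -- one row per distinct id, incremented on every insert into ids
    INSERT INTO id_counts (id, num) VALUES (NEW.id, 1)
    ON CONFLICT (id) DO UPDATE SET num = id_counts.num + 1;
    RETURN NEW;
END $$;

CREATE TRIGGER ids_count_sync AFTER INSERT ON ids
    FOR EACH ROW EXECUTE PROCEDURE bump_id_count();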

-- 
PT 




Re: [GENERAL] Monitoring of a hot standby with a largely idle master

2017-07-24 Thread Jeff Janes
On Sun, Jul 16, 2017 at 8:47 AM, Michael Paquier 
wrote:

> On Fri, Jul 14, 2017 at 9:11 PM, Jeff Janes  wrote:
> > On Thu, Jul 13, 2017 at 10:38 AM, Michael Paquier
> >  wrote:
> >>
> >> On Thu, Jul 13, 2017 at 7:23 PM, Jeff Janes 
> wrote:
> >> >
> >> > I think that pg_stat_wal_receiver should be crossreferenced in
> >> > https://www.postgresql.org/docs/9.6/static/hot-standby.html, near the
> >> > same
> >> > place which it crossreferences table 9-79.  That would make it more
> >> > discoverable.
> >>
> >> Hm. Hot standby may not involve streaming replication. What about a
> >> paragraph here instead?
> >>
> >> https://www.postgresql.org/docs/devel/static/warm-
> standby.html#streaming-replication
> >>
> >> In the monitoring subsection, we could tell that on a standby the WAL
> >> receiver status can be retrieved from this view when changes are
> >> streamed. What do you think?
> >
> >
> > That works for me.
>
> What do you think about the patch attached?
>

Looks OK.  Should it mention specifically "On a hot standby" rather than
"On a standby"?  Otherwise people might be left confused on how they are
supposed to do this on a generic standby.  It is the kind of thing which is
obvious once you know it, but confusing the first time you encounter it.



>
>  
>   You can retrieve a list of WAL sender processes via the
> - 
> + 
>   pg_stat_replication view. Large differences
> between
> In the previous paragraph I have noticed that the link reference is
> incorrect. pg_stat_replication is listed under
> monitoring-stats-dynamic-views-table.
>

Yes, that is clearly wrong.  But why not link directly to the description
of the view itself, pg-stat-replication-view, rather than the correct table
which mentions the view?  Is that the accepted docs style, to link to the
more generic place?  (The same thing applies to your patch; it could link
directly to pg-stat-wal-receiver-view.)

Sorry for the delay; it took me a while to get the new doc build system to
work (the solution seems to be, "Don't use CentOS 6 anymore").

Cheers,

Jeff


Re: [GENERAL] pg_restore misuse or bug?

2017-07-24 Thread Jordan Gigov
Assuming you have a user called "test", this will create a database that
suffers from this problem.

create database mvtest;
\c mvtest
create table main_table (id serial not null, something varchar(20), primary
key (id));
create table child_table (id serial not null, parent_id int not null,
somedate date not null, someval int not null, primary key(id), foreign
key(parent_id) references main_table(id));

insert into main_table(something) values('X-Men'),('Batman');
insert into child_table(parent_id, somedate, someval)
values(2,'1989-06-23',10),(2,'1992-06-19',4),(1,'2000-07-14',13),(1,'2014-05-23',16);

CREATE MATERIALIZED VIEW movie_things AS
SELECT mt.*, jsonb_object(array_agg(ct.somedate)::text[],
array_agg(ct.someval)::text[]) AS release_prizes FROM main_table mt
LEFT JOIN child_table ct ON (mt.id = ct.parent_id) GROUP BY mt.id;
CREATE UNIQUE INDEX IF NOT EXISTS movie_things_id_idx ON movie_things USING
btree (id);
ALTER MATERIALIZED VIEW movie_things CLUSTER ON movie_things_id_idx, OWNER
TO test;

On 21 July 2017 at 17:25, Jordan Gigov  wrote:

> This is on version 9.5, 9.6 and 10beta2. I could probably make a
> test-case over the weekend if I'm at home.
>
> On 21 July 2017 at 17:03, Tom Lane  wrote:
> > Jordan Gigov  writes:
> >> When running pg_restore as the superuser it gives the following error
> >> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> >> pg_restore: [archiver (db)] Error from TOC entry 3253; 0 320628
> >> MATERIALIZED VIEW DATA combined_query_data web_user
> >> pg_restore: [archiver (db)] could not execute query: ERROR:
> >> permission denied for relation first_table_in_from_list
> >> Command was: REFRESH MATERIALIZED VIEW combined_query_data;
> >
> > What PG version is this?  Can you provide a self-contained test case?
> >
> >> I see no reason why the superuser would get a "permission denied"
> >> error.
> >
> > Matview queries are run as the owner of the matview, so this isn't
> > as surprising as all that.  But if the matview works in your normal
> > usage, then pg_dump must be doing something wrong, perhaps emitting
> > grants in the wrong order.
> >
> > regards, tom lane
>


Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread Tim Clarke
On 24/07/17 15:01, PAWAN SHARMA wrote:
>
>
>
> On Mon, Jul 24, 2017 at 7:28 PM, Tim Clarke  > wrote:
>
>
>
> On 24/07/17 14:47, PAWAN SHARMA wrote:
> >
> > Hi Tim,
> >
> > Facing below issue
> >
> > [abc@test:/home/psharm89/nrpe-2.15]#
> > Redirecting to /bin/systemctl restart  xinetd.service
> > Failed to restart xinetd.service: Unit xinetd.service failed to
> load:
> > No such file or directory.
> > [abc@test:/home/psharm89/nrpe-2.15]#
> >
> > [abc@test:/home/psharm89/nrpe-2.15]#
> > #-> less /etc/xinetd.d/nrpe
> > # default: on
> > # description: NRPE (Nagios Remote Plugin Executor)
> > service nrpe
> > {
> > flags   = REUSE
> > socket_type = stream
> > port= 5666
> > wait= no
> > user= nagios
> > group   = nagios
> > server  = /usr/local/nagios/bin/nrpe
> > server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
> > log_on_failure  += USERID
> > disable = no
> > only_from   = 127.0.0.1
> > }
> >
> >
> >
>
> xinetd is broken? restore the config from backup.
>
> Tim
>
>
> #-> /etc/init.d/nagios restart 
> Restarting nagios (via systemctl):  Job for nagios.service failed
> because the control process exited with error code. See "systemctl
> status nagios.service" and "journalctl -xe" for details.
>

And what do

"systemctl status nagios.service"
and
"journalctl -xe"

tell you?

Tim





Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread PAWAN SHARMA
On Mon, Jul 24, 2017 at 7:28 PM, Tim Clarke 
wrote:

>
>
> On 24/07/17 14:47, PAWAN SHARMA wrote:
> >
> > Hi Tim,
> >
> > Facing below issue
> >
> > [abc@test:/home/psharm89/nrpe-2.15]#
> > Redirecting to /bin/systemctl restart  xinetd.service
> > Failed to restart xinetd.service: Unit xinetd.service failed to load:
> > No such file or directory.
> > [abc@test:/home/psharm89/nrpe-2.15]#
> >
> > [abc@test:/home/psharm89/nrpe-2.15]#
> > #-> less /etc/xinetd.d/nrpe
> > # default: on
> > # description: NRPE (Nagios Remote Plugin Executor)
> > service nrpe
> > {
> > flags   = REUSE
> > socket_type = stream
> > port= 5666
> > wait= no
> > user= nagios
> > group   = nagios
> > server  = /usr/local/nagios/bin/nrpe
> > server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
> > log_on_failure  += USERID
> > disable = no
> > only_from   = 127.0.0.1
> > }
> >
> >
> >
>
> xinetd is broken? restore the config from backup.
>
> Tim
>
>
#-> /etc/init.d/nagios restart
Restarting nagios (via systemctl):  Job for nagios.service failed because
the control process exited with error code. See "systemctl status
nagios.service" and "journalctl -xe" for details.


Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread Tim Clarke


On 24/07/17 14:47, PAWAN SHARMA wrote:
>
> Hi Tim,
>
> Facing below issue
>
> [abc@test:/home/psharm89/nrpe-2.15]#
> Redirecting to /bin/systemctl restart  xinetd.service
> Failed to restart xinetd.service: Unit xinetd.service failed to load:
> No such file or directory.
> [abc@test:/home/psharm89/nrpe-2.15]# 
>
> [abc@test:/home/psharm89/nrpe-2.15]#
> #-> less /etc/xinetd.d/nrpe
> # default: on
> # description: NRPE (Nagios Remote Plugin Executor)
> service nrpe
> {
> flags   = REUSE
> socket_type = stream
> port= 5666
> wait= no
> user= nagios
> group   = nagios
> server  = /usr/local/nagios/bin/nrpe
> server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
> log_on_failure  += USERID
> disable = no
> only_from   = 127.0.0.1 
> }
>
>
>

xinetd is broken? restore the config from backup.

Tim





Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread PAWAN SHARMA
On Mon, Jul 24, 2017 at 5:50 PM, PAWAN SHARMA 
wrote:

>
> On Mon, Jul 24, 2017 at 5:20 PM, Tim Clarke 
> wrote:
>
>> We use nagios for all alerting, it'll do what you need.
>>
>> Tim Clarke
>>
>>
>> On 24/07/17 12:38, Achilleas Mantzios wrote:
>> > pgbadger is a very nice reporting tool, overall, albeit not exactly at
>> > the system side, but more to the DBA side.
>> > For system level monitoring maybe take a look here :
>> > https://wiki.postgresql.org/wiki/Monitoring#check_postgres .
>> > Also you might want to write a script that parses logs for FATAL and
>> > PANIC and sends out emails.
>> >
>> > On 24/07/2017 14:27, PAWAN SHARMA wrote:
>> >> Hi All,
>> >>
>> >>
>> >> Please provide me a list of tools which we can use for monitoring
>> >> PostgreSQL.
>> >>
>> >> -Monitor all the services and health of server
>> >> -Able to send critical and warning alert on mail.
>> >>
>> >> OS: Redhat-7
>> >> PostgreSQL Version: 9.5.7
>> >>
>> >>
>> >> -Pawan
>> >
>> >
>> > --
>> > Achilleas Mantzios
>> > IT DEV Lead
>> > IT DEPT
>> > Dynacom Tankers Mgmt
>>
>>
>>
> Hi Tim,
>
> Thanks for update,
>
> Please share the steps, how to configure Nagios??
>


Hi Tim,

Facing below issue

[abc@test:/home/psharm89/nrpe-2.15]#
Redirecting to /bin/systemctl restart  xinetd.service
Failed to restart xinetd.service: Unit xinetd.service failed to load: No
such file or directory.
[abc@test:/home/psharm89/nrpe-2.15]#

[abc@test:/home/psharm89/nrpe-2.15]#
#-> less /etc/xinetd.d/nrpe
# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
flags   = REUSE
socket_type = stream
port= 5666
wait= no
user= nagios
group   = nagios
server  = /usr/local/nagios/bin/nrpe
server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
log_on_failure  += USERID
disable = no
only_from   = 127.0.0.1
}


Re: [GENERAL] Dealing with ordered hierarchies

2017-07-24 Thread Tim Uckun
I don't like the approach with a large increment. It would mean complicated
logic to see if you have filled the gap, and then updating all the other
peers if you have. It sounds like the re-order is going to be expensive no
matter what. My primary concern is race conditions, though. What if two or
more users are trying to update the hierarchy, either by inserts or
updates? I can definitely see a situation where transactions trip over each
other.

On Mon, Jul 24, 2017 at 10:32 PM, Alban Hertroys  wrote:

>
> > On 24 Jul 2017, at 9:02, Tim Uckun  wrote:
> >
> > I have read many articles about dealing with hierarchies in postgres
> including nested sets, ltree, materialized paths, using arrays as
> parentage,  CTEs etc but nobody talks about the following scenario.
> >
> > Say I have a hierarchy like this
> >
> > 1
> > 1.1
> > 1.1.1
> > 1.1.2
> > 1.2
> > 1.3
> > 2
> > 2.1
> >
> > In this hierarchy the order is very important and I want to run
> frequent(ish) re-ordering of both subsets and entire trees and even more
> frequent inserts.
>
> Since they're hierarchies, the order is already in the structure of the
> data. Do you really need to add it to the data or would it suffice to add
> it to the query result?
>
> If that's the case, you only need a simple ordering number per branch,
> like 1, 2, 3, etc. The full path (ie. '1.1.3') gets generated in the query.
>
> I regularly generate structures like your above example using recursive
> CTE's. The "path" helps to get the results in the correct order for
> starters (although you're in for a surprise if any of your levels go past 9
> in the above). It's great how you can "trickle" all kinds of calculations
> through the hierarchy using CTE's.
>
> Something like this should help to get you started (untested, I usually do
> this in Oracle, which has several peculiarities):
>
> with recursive hierarchy (parent, node, sequence_number, path) as (
>     select null, node, sequence_number, sequence_number::text
>       from "table" where parent is null   -- root nodes only
>     union all
>     select h.node, t.node, t.sequence_number,
>            h.path || '.' || t.sequence_number::text
>       from "table" t
>       join hierarchy h on (t.parent = h.node)
> )
> select node, path
>   from hierarchy
>
> Where the table "table" has fields:
> parent  -- parent node
> node-- actual node
> sequence_number -- Order of sequence of this node within its
> parent branch
>
> You may need to add a surrogate key if your parent/child combinations are
> otherwise not unique. That would then also be the way to address a node
> directly (otherwise it would be (parent, node)).
>
> For the sequence_number I'd probably just use an actual sequence generator
> with a large enough gap to prevent problems with reordering items later on
> (increment by 10 for example). You will also want to pad the sequence
> numbers in the "path" column with leading zeroes (otherwise 10 sorts
> between 1 and 2, etc.), enough that you won't run out of numbers per level.
>
> If you require your sequence numbers to be subsequent in the result: You
> can add a field with such numbering based on the existing sequence_numbers,
> by using a windowing function in each branch of the union - it's down to a
> fairly basic row numbering problem at this point.
>
> > Scenario 1: I want to insert a child into the 1.1 subtree.  The next
> item should be 1.1.3 and I can't figure out any other way to do this other
> than to subquery the children and to figure out the max child ID, add one
> to it which is a race condition waiting to happen.
>
> You would first need to determine which node is the parent node by
> traversing the hierarchy up to the point of insertion and use the (parent,
> node) or surrogate key fields to append under. Similar to using '1.1',
> really.
>
> > Scenario 2: I now decide the recently inserted item is the second most
> important so I reset the ID to 1.1.2 and then increment 1.1.2 (and possibly
> everything below).  Again this is both prone to race conditions and
> involves a heavy update.
>
> No need to bother with that (much) with the above approach. And if you do
> run out of gaps, you can fairly simply update all the sequence numbers
> under the same parent without causing concurrency issues and without
> requiring locks/synchronisation.
>
> > Is there a better way to deal with this or is the complexity unavoidable?
>
> I think it's better, but I don't think its ideal. It's fairly complicated
> to understand, for one thing, which can cause problems for maintenance (I
> have colleagues who don't dare to touch my queries, for example).
>
> > I should state that like most database reads will be much more frequent
> than writes and inserts will be more frequent than updates (re-ordering)
>
> More of the logic (and thus system load) gets moved to the read-side of
> things, that's probably a drawback, but most of it is just keeping state
> and counting. I don't expect that to be all that much.

Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread PAWAN SHARMA
On Mon, Jul 24, 2017 at 5:20 PM, Tim Clarke 
wrote:

> We use nagios for all alerting, it'll do what you need.
>
> Tim Clarke
>
>
> On 24/07/17 12:38, Achilleas Mantzios wrote:
> > pgbadger is a very nice reporting tool, overall, albeit not exactly at
> > the system side, but more to the DBA side.
> > For system level monitoring maybe take a look here :
> > https://wiki.postgresql.org/wiki/Monitoring#check_postgres .
> > Also you might want to write a script that parses logs for FATAL and
> > PANIC and sends out emails.
> >
> > On 24/07/2017 14:27, PAWAN SHARMA wrote:
> >> Hi All,
> >>
> >>
> >> Please provide me a list of tools which we can use for monitoring
> >> PostgreSQL.
> >>
> >> -Monitor all the services and health of server
> >> -Able to send critical and warning alert on mail.
> >>
> >> OS: Redhat-7
> >> PostgreSQL Version: 9.5.7
> >>
> >>
> >> -Pawan
> >
> >
> > --
> > Achilleas Mantzios
> > IT DEV Lead
> > IT DEPT
> > Dynacom Tankers Mgmt
>
>
>
Hi Tim,

Thanks for the update.

Please share the steps for configuring Nagios.


Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread Tim Clarke
We use nagios for all alerting, it'll do what you need.

Tim Clarke
 

On 24/07/17 12:38, Achilleas Mantzios wrote:
> pgbadger is a very nice reporting tool, overall, albeit not exactly at
> the system side, but more to the DBA side.
> For system level monitoring maybe take a look here :
> https://wiki.postgresql.org/wiki/Monitoring#check_postgres .
> Also you might want to write a script that parses logs for FATAL and
> PANIC and sends out emails.
>
> On 24/07/2017 14:27, PAWAN SHARMA wrote:
>> Hi All,
>>
>>
>> Please provide me a list of tools which we can use for monitoring
>> PostgreSQL.
>>
>> -Monitor all the services and health of server 
>> -Able to send critical and warning alert on mail.
>>
>> OS: Redhat-7
>> PostgreSQL Version: 9.5.7
>>
>>
>> -Pawan
>
>
> -- 
> Achilleas Mantzios
> IT DEV Lead
> IT DEPT
> Dynacom Tankers Mgmt






Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread Achilleas Mantzios

pgbadger is a very nice reporting tool overall, albeit not exactly on the
system side, but more on the DBA side.
For system-level monitoring, maybe take a look here:
https://wiki.postgresql.org/wiki/Monitoring#check_postgres .
Also you might want to write a script that parses logs for FATAL and PANIC and
sends out emails.

On 24/07/2017 14:27, PAWAN SHARMA wrote:

Hi All,


Please provide me a list of tools which we can use for monitoring PostgreSQL.

-Monitor all the services and health of server
-Able to send critical and warning alert on mail.

OS: Redhat-7
PostgreSQL Version: 9.5.7


-Pawan



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: [GENERAL] monitoring PostgreSQL

2017-07-24 Thread Nikolay Samokhvalov
okmeter.io is good (though it's cloud-based and costs some money); I'm
using it and can definitely recommend it.

There are many more tools; see: https://wiki.postgresql.org/wiki/Monitoring

On Mon, Jul 24, 2017 at 4:27 AM, PAWAN SHARMA 
wrote:

> Hi All,
>
>
> Please provide me a list of tools which we can use for monitoring
> PostgreSQL.
>
> -Monitor all the services and health of server
> -Able to send critical and warning alert on mail.
>
> OS: Redhat-7
> PostgreSQL Version: 9.5.7
>
>
> -Pawan
>


[GENERAL] monitoring PostgreSQL

2017-07-24 Thread PAWAN SHARMA
Hi All,


Please provide me a list of tools which we can use for monitoring
PostgreSQL.

-Monitor all the services and health of server
-Able to send critical and warning alert on mail.

OS: Redhat-7
PostgreSQL Version: 9.5.7


-Pawan


Re: [GENERAL] Dealing with ordered hierarchies

2017-07-24 Thread Alban Hertroys

> On 24 Jul 2017, at 9:02, Tim Uckun  wrote:
> 
> I have read many articles about dealing with hierarchies in postgres 
> including nested sets, ltree, materialized paths, using arrays as parentage,  
> CTEs etc but nobody talks about the following scenario.
> 
> Say I have a hierarchy like this
> 
> 1
> 1.1
> 1.1.1
> 1.1.2
> 1.2
> 1.3
> 2
> 2.1
> 
> In this hierarchy the order is very important and I want to run frequent(ish) 
> re-ordering of both subsets and entire trees and even more frequent inserts.

Since they're hierarchies, the order is already in the structure of the data. 
Do you really need to add it to the data or would it suffice to add it to the 
query result?

If that's the case, you only need a simple ordering number per branch, like 1, 
2, 3, etc. The full path (ie. '1.1.3') gets generated in the query.

I regularly generate structures like your above example using recursive CTE's. 
The "path" helps to get the results in the correct order for starters (although 
you're in for a surprise if any of your levels go past 9 in the above). It's 
great how you can "trickle" all kinds of calculations through the hierarchy 
using CTE's.

Something like this should help to get you started (untested, I usually do this 
in Oracle, which has several peculiarities):

with recursive hierarchy (parent, node, sequence_number, path) as (
    select null, node, sequence_number, sequence_number::text
      from "table" where parent is null   -- root nodes only
    union all
    select h.node, t.node, t.sequence_number,
           h.path || '.' || t.sequence_number::text
      from "table" t
      join hierarchy h on (t.parent = h.node)
)
select node, path
  from hierarchy

Where the table "table" has fields:
parent  -- parent node
node-- actual node
sequence_number -- Order of sequence of this node within its parent 
branch

You may need to add a surrogate key if your parent/child combinations are 
otherwise not unique. That would then also be the way to address a node 
directly (otherwise it would be (parent, node)).

For the sequence_number I'd probably just use an actual sequence generator with 
a large enough gap to prevent problems with reordering items later on 
(increment by 10 for example). You will also want to pad the sequence numbers 
in the "path" column with leading zeroes (otherwise 10 sorts between 1 and 2, 
etc.), enough that you won't run out of numbers per level.
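
A hedged sketch of that padding applied to the query above (assuming four
digits per level is enough):

with recursive hierarchy (parent, node, sequence_number, path) as (
    select null, node, sequence_number, lpad(sequence_number::text, 4, '0')
      from "table" where parent is null
    union all
    select h.node, t.node, t.sequence_number,
           h.path || '.' || lpad(t.sequence_number::text, 4, '0')
      from "table" t
      join hierarchy h on (t.parent = h.node)
)
select node, path from hierarchy order by path;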

If you require your sequence numbers to be subsequent in the result: You can 
add a field with such numbering based on the existing sequence_numbers, by 
using a windowing function in each branch of the union - it's down to a fairly 
basic row numbering problem at this point.
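
For example (a sketch, written as the final select of the query above;
applying it over the finished hierarchy is simpler than embedding it in
each branch of the union):

select node, path,
       row_number() over (partition by parent order by sequence_number)
         as position_in_branch
  from hierarchy;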

> Scenario 1: I want to insert a child into the 1.1 subtree.  The next item 
> should be 1.1.3 and I can't figure out any other way to do this other than to 
> subquery the children and to figure out the max child ID, add one to it which 
> is a race condition waiting to happen.

You would first need to determine which node is the parent node by traversing 
the hierarchy up to the point of insertion and use the (parent, node) or 
surrogate key fields to append under. Similar to using '1.1', really.

> Scenario 2: I now decide the recently inserted item is the second most 
> important so I reset the ID to 1.1.2 and then increment 1.1.2 (and possibly 
> everything below).  Again this is both prone to race conditions and involves 
> a heavy update.

No need to bother with that (much) with the above approach. And if you do run 
out of gaps, you can fairly simply update all the sequence numbers under the 
same parent without causing concurrency issues and without requiring 
locks/synchronisation.

> Is there a better way to deal with this or is the complexity unavoidable?

I think it's better, but I don't think its ideal. It's fairly complicated to 
understand, for one thing, which can cause problems for maintenance (I have 
colleagues who don't dare to touch my queries, for example).

> I should state that like most database reads will be much more frequent than 
> writes and inserts will be more frequent than updates (re-ordering)

More of the logic (and thus system load) gets moved to the read-side of things, 
that's probably a drawback, but most of it is just keeping state and counting. 
I don't expect that to be all that much.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: [GENERAL] Performance of IN-clause with many elements and possible solutions

2017-07-24 Thread Dmitry Lazurkin
On 07/24/2017 01:40 AM, PT wrote:
> In this example you count approximately 40,000,000 values, which is
> about 40% of the table. 
4 000 000 (:

> If you really need these queries to be faster, I would suggest
> materializing the data, i.e. create a table like:
>
> CREATE TABLE id_counts (
>  id BIGINT PRIMARY KEY,
>  num BIGINT
> )
>
> Then use a trigger or similar technique to keep id_counts in sync
> with the id table. You can then run queries of the form:
>
> SELECT sum(num) FROM id_counts WHERE id IN :values:
>
> which I would wager houseboats will be significantly faster.
I use count only as an example because it uses a seqscan. I want to
optimize the IN-clause ;-).

Thanks.





Re: [GENERAL] Dealing with ordered hierarchies

2017-07-24 Thread Achilleas Mantzios

On 24/07/2017 10:02, Tim Uckun wrote:

I have read many articles about dealing with hierarchies in postgres including 
nested sets, ltree, materialized paths, using arrays as parentage,  CTEs etc 
but nobody talks about the following scenario.

Say I have a hierarchy like this

1
1.1
1.1.1
1.1.2
1.2
1.3
2
2.1

In this hierarchy the order is very important and I want to run frequent(ish) 
re-ordering of both subsets and entire trees and even more frequent inserts.

Scenario 1: I want to insert a child into the 1.1 subtree. The next item should be 1.1.3 and I can't figure out any other way to do this other than to subquery the children and to figure out the max 
child ID, add one to it which is a race condition waiting to happen.


Scenario 2: I now decide the recently inserted item is the second most important so I reset the ID to 1.1.2 and then increment 1.1.2 (and possibly everything below).  Again this is both prone to 
race conditions and involves a heavy update.


Is there a better way to deal with this or is the complexity unavoidable?

Maybe you could try a hybrid approach with genealogical paths, represented by
arrays, and a (possibly bidirectional) linked list storing the siblings of the
same parent.
Basically you want to convert your problem into something that can be
represented in such a way that it runs fast on PostgreSQL.
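
A hypothetical sketch of such a schema (all names illustrative; the array
carries ancestry while sibling order lives only in the linked list, so
reordering two siblings touches just the list pointers):

CREATE TABLE node (
    id        bigint PRIMARY KEY,
    parent_id bigint REFERENCES node(id),
    path      bigint[] NOT NULL,           -- genealogical path of ids, e.g. {1,7,42}
    prev_sib  bigint REFERENCES node(id),  -- doubly linked sibling list
    next_sib  bigint REFERENCES node(id)
);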


I should state that like most database reads will be much more frequent than 
writes and inserts will be more frequent than updates (re-ordering)



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt





[GENERAL] Dealing with ordered hierarchies

2017-07-24 Thread Tim Uckun
I have read many articles about dealing with hierarchies in postgres
including nested sets, ltree, materialized paths, using arrays as
parentage,  CTEs etc but nobody talks about the following scenario.

Say I have a hierarchy like this

1
1.1
1.1.1
1.1.2
1.2
1.3
2
2.1

In this hierarchy the order is very important and I want to run
frequent(ish) re-ordering of both subsets and entire trees and even more
frequent inserts.

Scenario 1: I want to insert a child into the 1.1 subtree.  The next item
should be 1.1.3, and I can't figure out any other way to do this than to
subquery the children, figure out the max child ID, and add one to it,
which is a race condition waiting to happen.
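
That pattern, as a sketch (table and columns hypothetical), plus one way to
serialize it with a per-parent advisory lock:

BEGIN;
-- serialize concurrent inserts under the same parent
SELECT pg_advisory_xact_lock(hashtext('1.1'));
INSERT INTO tree (parent, seq)
SELECT '1.1', coalesce(max(seq), 0) + 1
  FROM tree WHERE parent = '1.1';
COMMIT;
-- without the lock, two sessions can read the same max(seq)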

Scenario 2: I now decide the recently inserted item is the second most
important, so I reset its ID to 1.1.2 and then increment the old 1.1.2 (and
possibly everything below).  Again this is both prone to race conditions
and involves a heavy update.

Is there a better way to deal with this or is the complexity unavoidable?

I should state that, as in most databases, reads will be much more frequent
than writes, and inserts will be more frequent than updates (re-ordering).