Order of multicolumn gist index

2024-01-09 Thread Paul van der Linden
I have the following query:
SELECT *
FROM polygons
WHERE minzoom <= {zoom} AND st_intersects(way, {tileboundary})
This query is run for any given tile of the OpenStreetMap tiling scheme.
So the zoom level runs from 0 to 14, and the number of polygons per level
grows roughly exponentially.

The Postgres documentation (https://www.postgresql.org/docs/current/indexes-multicolumn.html)
states that
"A GiST index will be relatively ineffective if its first column has only a few 
distinct values, even if there are many distinct values in additional columns."

So I am trying to figure out what difference the column order makes in the
index.
I created the following indices:
CREATE INDEX polygon_minzoom_geo ON polygons USING gist (minzoom,way)
CREATE INDEX polygon_minzoom_geo2 ON polygons USING gist (way,minzoom)
And did a test for a zoom=9 tile:
SELECT *
FROM polygons
WHERE minzoom <= 9 AND st_intersects(way, 'SRID=3857;POLYGON((547900 6653078,547900 6574807,626172 6574807,626172 6653078,547900 6653078))')

After running that query each time with one of the two indices disabled (via
the BEGIN; DROP INDEX; ...; ROLLBACK trick sketched below) I get the following
EXPLAIN results.
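
A minimal sketch of that disable-one-index pattern; since the transaction is
rolled back, the index is never really gone:

BEGIN;
DROP INDEX polygon_minzoom_geo2;  -- hide one of the two indices from the planner
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM polygons
WHERE minzoom <= 9 AND st_intersects(way, 'SRID=3857;POLYGON((547900 6653078,547900 6574807,626172 6574807,626172 6653078,547900 6653078))');
ROLLBACK;  -- DDL is transactional in PostgreSQL, so the index comes back
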
Using (minzoom,way):

Result  (cost=0.42..228992.47 rows=229000 width=113) (actual time=149.483..1471.819 rows=42463 loops=1)
  Buffers: shared hit=352653
  ->  ProjectSet  (cost=0.42..4572.47 rows=229000 width=88) (actual time=149.461..1234.048 rows=42463 loops=1)
        Buffers: shared hit=274281
        ->  Index Scan using polygon_minzoom_geo on polygons  (cost=0.42..3283.20 rows=229 width=264) (actual time=149.409..955.849 rows=42463 loops=1)
              Index Cond: ((minzoom <= 9) AND (way && '010320110F01000500BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry))
              Filter: ((minzoom <= 9) AND (way && '010320110F01000500BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry))
              Rows Removed by Filter: 345
              Buffers: shared hit=274025
Planning Time: 7.220 ms
Execution Time: 1494.267 ms


Using (way,minzoom):

Result  (cost=0.42..228992.47 rows=229000 width=113) (actual time=178.747..1715.135 rows=42463 loops=1)
  Buffers: shared hit=350570
  ->  ProjectSet  (cost=0.42..4572.47 rows=229000 width=88) (actual time=178.731..1436.764 rows=42463 loops=1)
        Buffers: shared hit=272198
        ->  Index Scan using polygon_minzoom_geo2 on polygons  (cost=0.42..3283.20 rows=229 width=264) (actual time=178.683..1118.691 rows=42463 loops=1)
              Index Cond: ((way && '010320110F01000500BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry) AND (minzoom <= 9))
              Filter: ((way && '010320110F01000500BE5C7A45F81B23412C8346BC25615941BE5C7A45F81B2341B99930DBB51459412611CB3C79B82041B99930DBB51459412611CB3C79B820412C8346BC25615941BE5C7A45F81B23412C8346BC25615941'::geometry) AND (minzoom <= 9))
              Rows Removed by Filter: 345
              Buffers: shared hit=271942
Planning Time: 9.427 ms
Execution Time: 1742.729 ms

So all in all, not really a big difference.
Is this situation somehow special, so that the remark in the documentation
doesn't apply here, or am I missing something in the analysis that would
reveal the difference?

Paul

P.S. when replying, please include me too



Completely wrong queryplan

2022-05-04 Thread Paul van der Linden
Hi,

I'm stumbling on an issue which seems like this one:
https://www.postgresql.org/message-id/20170719152038.19353.71475%40wrigleys.postgresql.org,
but I hope someone can shed some light on my specific case.

Software:
POSTGIS="3.1.4 ded6c34" [EXTENSION] PGSQL="140" GEOS="3.8.0-CAPI-1.13.1 "
PROJ="6.3.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3"
WAGYU="0.5.0 (Internal)"
PostgreSQL 14.0 (Ubuntu 14.0-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit

2 relevant tables and indices:
CREATE TABLE osm_current.planet_osm_point
(
  osm_id bigint NOT NULL,
  tags hstore,
  way geometry(Point,3857)
);
CREATE INDEX planet_osm_poi_bigfunc_geo
  ON osm_current.planet_osm_point
  USING gist
  (bigfunc(tags), way)
  WHERE bigfunc(tags) <= 14;

CREATE TABLE osm_current.planet_osm_polygon
(
  osm_id bigint NOT NULL,
  tags hstore,
  way geometry(Polygon,3857)
);
CREATE INDEX planet_osm_polygon_bigfunc_geo
  ON osm_current.planet_osm_polygon
  USING gist
  (bigfunc(tags), way)
  WHERE bigfunc(tags) <= 14;

Query:
SELECT *
FROM osm_current.planet_osm_polygon
WHERE bigfunc(tags) <= 7
  AND NOT EXISTS(
    SELECT *
    FROM osm_current.planet_osm_point
    WHERE bigfunc(planet_osm_point.tags) <= 7
      AND ST_Intersects(planet_osm_point.way, planet_osm_polygon.way)
      AND bigfunc2(planet_osm_point.tags) = bigfunc2(planet_osm_polygon.tags)
  )
  AND ST_Intersects(
    'SRID=3857;POLYGON((15012477.510296581 3741379.0533562037,15012477.510296581 4398859.837299369,15669958.252794353 4398859.837299369,15669958.252794353 3741379.0533562037,15012477.510296581 3741379.0533562037))'::geometry,
    way)

Normal execution: I canceled it after 1 hour...
Explain gives:
Gather  (cost=22998304.12..81977433.81 rows=2628686 width=262)
  Workers Planned: 2
  ->  Parallel Hash Anti Join  (cost=22997304.12..81713565.21 rows=1095286 width=262)
        Hash Cond: (bigfunc2(planet_osm_polygon.tags) = bigfunc2(planet_osm_point.tags))
        Join Filter: st_intersects(planet_osm_point.way, planet_osm_polygon.way)
        ->  Parallel Bitmap Heap Scan on planet_osm_polygon  (cost=51152.38..30790214.58 rows=1096787 width=262)
              Recheck Cond: (bigfunc(tags) <= 7)
              Filter: st_intersects('010320110F010005007F5954B04FA26C414760D486618B4C417F5954B04FA26C41185096F5C2C750412FE416C858E36D41185096F5C2C750412FE416C858E36D414760D486618B4C417F5954B04FA26C414760D486618B4C41'::geometry, way)
              ->  Bitmap Index Scan on planet_osm_polygon_bigfunc_geo  (cost=0.00..50494.31 rows=2632289 width=0)
                    Index Cond: ((bigfunc(tags) <= 7) AND (way && '010320110F010005007F5954B04FA26C414760D486618B4C417F5954B04FA26C41185096F5C2C750412FE416C858E36D41185096F5C2C750412FE416C858E36D414760D486618B4C417F5954B04FA26C414760D486618B4C41'::geometry))
        ->  Parallel Hash  (cost=22204690.21..22204690.21 rows=23875962 width=126)
              ->  Parallel Bitmap Heap Scan on planet_osm_point  (cost=309564.90..22204690.21 rows=23875962 width=126)
                    Recheck Cond: (bigfunc(tags) <= 7)
                    ->  Bitmap Index Scan on planet_osm_poi_bigfunc_geo  (cost=0.00..295239.32 rows=57302310 width=0)
                          Index Cond: (bigfunc(tags) <= 7)

When setting enable_hashjoin to false it gets radically different:
Gather  (cost=52152.79..169588182414.71 rows=2628686 width=262) (actual time=11.162..1037.116 rows=5381 loops=1)
  Output: planet_osm_polygon.osm_id, planet_osm_polygon.tags, planet_osm_polygon.way
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=380500 read=5531
  ->  Nested Loop Anti Join  (cost=51152.79..169587918546.11 rows=1095286 width=262) (actual time=2.867..1015.295 rows=1794 loops=3)
        Output: planet_osm_polygon.osm_id, planet_osm_polygon.tags, planet_osm_polygon.way
        Buffers: shared hit=380500 read=5531
        Worker 0:  actual time=1.168..1011.822 rows=1834 loops=1
          Buffers: shared hit=129515 read=1663
        Worker 1:  actual time=1.236..1010.438 rows=1858 loops=1
          Buffers: shared hit=129837 read=1632
        ->  Parallel Bitmap Heap Scan on osm_current.planet_osm_polygon  (cost=51152.38..30790214.58 rows=1096787 width=262) (actual time=1.846..23.809 rows=1853 loops=3)
              Output: planet_osm_polygon.osm_id, planet_osm_polygon.tags, planet_osm_polygon.way
              Recheck Cond: (bigfunc(planet_osm_polygon.tags) <= 7)
              Filter: st_intersects('010320110F010005007F5954B04FA26C414760D486618B4C417F5954B04FA26C41185096F5C2C750412FE416C858E36D41185096F5C2C750412FE416C858E36D414760D486618B4C417F5954B04FA26C414760D486618B4C41'::geometry, planet_osm_polygon.way)
              Heap Blocks: exact=1235
              Buffers: shared hit=9 read=4104
              Worker 0:  actual time=0.135..22.343 rows=1902 loops=1
                Buffers: shared hit=2 read=1317
              Worker
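
For reference, the toggle used for this comparison is an ordinary session
setting; a minimal sketch, with the query body abbreviated to the anti-join
query shown above:

SET enable_hashjoin = off;  -- steer the planner away from the hash anti-join
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM osm_current.planet_osm_polygon WHERE ...;  -- the full query from above
RESET enable_hashjoin;      -- restore the default planner behaviour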

Re: Cannot find hstore operator

2022-01-24 Thread Paul van der Linden
Thanks, works perfectly!

On Sun, Jan 23, 2022 at 4:22 PM Tom Lane  wrote:

> Paul van der Linden  writes:
> > Thanks for the clarification, but giving up performance is a no-go for us.
> > Also I have my concerns about schema-qualifying each and every use of the
> > -> operator; there are really a lot of them in my functions and it would
> > severely impact readability.
> > Are these the only 2 solutions possible?
>
> As of v14 you could use SQL-style function definitions, so that the
> operator is parsed at function definition time instead of runtime.
>
> regards, tom lane
>
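
A minimal sketch of the v14 feature Tom refers to. The real body of
line_function isn't shown in this thread, so the hstore lookup below is
purely hypothetical:

CREATE FUNCTION line_function(tags public.hstore) RETURNS text
LANGUAGE SQL IMMUTABLE
RETURN tags -> 'highway';  -- hypothetical body

With a SQL-standard body like this, the -> operator is resolved once, when the
function is created, so an empty search_path during autoanalyze can no longer
break it.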


Re: Cannot find hstore operator

2022-01-23 Thread Paul van der Linden
Thanks for the clarification, but giving up performance is a no-go for us.

Also I have my concerns about schema-qualifying each and every use of the ->
operator; there are really a lot of them in my functions and it would
severely impact readability.
Are these the only 2 solutions possible?

Paul

On Thu, Jan 20, 2022 at 3:05 PM Tom Lane  wrote:

> Paul van der Linden  writes:
> > during maintenance I saw a lot of lines in my Postgres log saying:
> > CONTEXT:  SQL function "line_function" during inlining
> > automatic analyze of table "osm.planet_osm_line"
> > ERROR:  operator does not exist: public.hstore -> unknown at character 45
>
> It sounds like line_function is careless about its search path
> assumptions.  auto-analyze will run index expressions with the
> search_path set to empty (i.e., only pg_catalog is accessible)
> and hstore isn't normally installed in pg_catalog.
>
> The easy fix would be to attach "SET search_path = public"
> to that function, but I believe that destroys the ability to
> inline it, which might be a performance problem for you.
> Alternatively you could schema-qualify the operator name,
> that is "foo OPERATOR(public.->) bar".
>
> regards, tom lane
>
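
A minimal sketch of the two fixes Tom describes, assuming line_function takes
a single hstore argument (the signature and the 'highway' key are invented):

-- fix 1: pin the search path on the function (this may prevent inlining)
ALTER FUNCTION line_function(public.hstore) SET search_path = public;

-- fix 2: schema-qualify the operator everywhere inside the function body
SELECT tags OPERATOR(public.->) 'highway' FROM osm.planet_osm_line;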


Cannot find hstore operator

2022-01-20 Thread Paul van der Linden
Hi,

during maintenance I saw a lot of lines in my Postgres log saying:
CONTEXT:  SQL function "line_function" during inlining
automatic analyze of table "osm.planet_osm_line"
ERROR:  operator does not exist: public.hstore -> unknown at character 45
HINT:  No operator matches the given name and argument types. You might
need to add explicit type casts.

Now the hint gives me an option, but I really don't like sprinkling ::text
through all my functions.
When executed (with search_path=public) this function works correctly, and
I double-checked that all (or at least a lot of) the hstore-related
functions are present in the public schema.
Are there any other solutions to this?

Paul

PS please cc me when answering


Re: CTE Materialization

2021-12-09 Thread Paul van der Linden
This one quite nicely explains it:
https://stackoverflow.com/questions/14897816/how-can-i-prevent-postgres-from-inlining-a-subquery

On Wed, Dec 8, 2021 at 3:14 AM David G. Johnston 
wrote:

> On Tue, Dec 7, 2021 at 6:40 PM Дмитрий Иванов 
> wrote:
>
>> I beg your pardon.
>> The problem is more or less clear to me, but the solution is not. What
>> does "the hack is to add an 'offset 0' to the query" suggest? Thank you.
>>
>>
> A subquery with a LIMIT clause cannot have WHERE clause expressions in
> upper parts of the query tree pushed down into it without changing the
> overall query result - something the planner is not allowed to do.  For the
> hack, since adding an actual LIMIT clause doesn't make sense you omit it,
> but still add the related OFFSET clause so the planner still treats the
> subquery as a LIMIT subquery.  And since you don't want to skip any rows
> you specify 0 for the offset.
>
> David J.
>
>
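
Putting that explanation into code, a minimal sketch (the table, column, and
function names are invented):

SELECT *
FROM (
  SELECT id, expensive_function(id) AS val
  FROM big_table
  OFFSET 0         -- skips nothing, but marks this as a LIMIT/OFFSET subquery,
                   -- so the outer predicate is not pushed down into it
) AS sub
WHERE sub.val = 1; -- evaluated against the subquery's output, not inside it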


Re: CTE Materialization

2021-12-06 Thread Paul van der Linden
It did indeed work as expected.
Took the query down from over 18 hours to 20 minutes, so a huge win!

Paul

On Thu, Dec 2, 2021 at 11:34 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thursday, December 2, 2021, Paul van der Linden <
> paul.doskabou...@gmail.com> wrote:
>
>> Hi,
>>
>> when switching to Postgres 14 (from 11) I'm having some slow queries
>> because of the inlining of CTEs.
>> I know I can get the same result as with PG11 by adding MATERIALIZED to
>> the CTE, but the same application also needs to be able to run on older
>> Postgres versions, so that is a no-go.
>> Is there any other way that I can have materialized CTEs in PG14 while
>> staying compatible with older PG versions?
>> Much appreciated,
>>
>
> The usual anti-inlining hack is to add an “offset 0” to the query.
> Haven’t tried it in 14 myself though.
>
> David J.
>
>


Re: CTE Materialization

2021-12-06 Thread Paul van der Linden
Thanks a lot, completely forgot that one!
Gonna test that tomorrow...

On Thu, Dec 2, 2021 at 11:34 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thursday, December 2, 2021, Paul van der Linden <
> paul.doskabou...@gmail.com> wrote:
>
>> Hi,
>>
>> when switching to Postgres 14 (from 11) I'm having some slow queries
>> because of the inlining of CTEs.
>> I know I can get the same result as with PG11 by adding MATERIALIZED to
>> the CTE, but the same application also needs to be able to run on older
>> Postgres versions, so that is a no-go.
>> Is there any other way that I can have materialized CTEs in PG14 while
>> staying compatible with older PG versions?
>> Much appreciated,
>>
>
> The usual anti-inlining hack is to add an “offset 0” to the query.
> Haven’t tried it in 14 myself though.
>
> David J.
>
>


CTE Materialization

2021-12-02 Thread Paul van der Linden
Hi,

when switching to Postgres 14 (from 11) I'm having some slow queries
because of the inlining of CTEs.
I know I can get the same result as with PG11 by adding MATERIALIZED to
the CTE, but the same application also needs to be able to run on older
Postgres versions, so that is a no-go.
Is there any other way that I can have materialized CTEs in PG14 while
staying compatible with older PG versions?
Much appreciated,

Paul

PS please cc me when answering
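
For context, a minimal sketch of the MATERIALIZED form mentioned above (the
names are invented); servers before PG12 reject the keyword at parse time,
which is exactly the compatibility problem:

WITH w AS MATERIALIZED (  -- accepted from PG12 on, a syntax error on PG11
  SELECT id, expensive_function(id) AS val
  FROM big_table
)
SELECT * FROM w WHERE val = 1;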


Re: Slow index creation

2021-02-24 Thread Paul van der Linden
Thanks for all the suggestions.

When the server is not in use for mission-critical work, I'm definitely
going to do some testing based on your ideas.
I'll let you know what comes out of that.

Cheers,
Paul

On Thu, Feb 18, 2021 at 6:49 PM hubert depesz lubaczewski 
wrote:

> On Thu, Feb 18, 2021 at 10:24:25AM -0700, Michael Lewis wrote:
> >   [1] https://www.depesz.com/2010/03/18/profiling-stored-proceduresfunctions/
> >
> > Thanks for this reference. I enjoy your blog, but haven't made the time
> > to read all the archives somehow. Stuff doesn't stick very well when it
> > isn't yet "needed" info besides.
> > I have seen overhead from 'raise notice' in small functions that are
> > sometimes called many thousands of times in a single query, but hadn't
> > done the test to verify if the same overhead still exists for raise debug
> > or another level below both client_min_messages and log_min_messages.
> > Using your examples, I saw about .006 ms for each call to RAISE DEBUG
> > with a client/log_min as notice/warning.
>
> Sure, this overhead is definitely possible, but kinda beside the point
> - there will be some slowdowns in other places, and it will be good to
> track them.
> That's why I suggested doing it on a small sample of data.
>
> Best regards,
>
> depesz
>
>


Re: Slow index creation

2021-02-17 Thread Paul van der Linden
Well, first off it's annoying that I have to change the function and then
reindex afterwards, and secondly, lots of other queries block on that
reindex (basically everything needing a query plan on that table).

Materializing is also an option, but that too takes its time.

As far as I know there's no parallelism used currently, and as per the
documentation, only b-tree index builds support parallelism.
Also my Postgres installation (on Windows) doesn't seem to do anything with
JIT (even after setting all the JIT-related cost values to 0).

I was more trying to get a feeling on where the slowness is, and how to
improve that...

On Tue, Feb 16, 2021 at 7:45 PM Michael Lewis  wrote:

> What is your concern with it taking 20 hours vs 1 hour? Is this index
> re-created on a regular basis?
>
> Would it make any sense to materialize the value of foo(a,b,c) as a
> generated column (PG12+ natively, or maintained by a trigger before)? Or
> even bar(foo(a,b,c),geom)?
>
> Do you know if parallel_workers are being used?
>
> JIT is available in PG11, it is just off by default. If it is available,
> turning it on and trying it seems like the simplest check if it would speed
> up the index creation.
>
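
A minimal sketch of the generated-column idea from the quoted reply (PG12+;
the new column name is invented):

ALTER TABLE t
  ADD COLUMN category int
  GENERATED ALWAYS AS (bar(foo(a, b, c), geom)) STORED;  -- computed on each row write

CREATE INDEX ON t USING gist (geom) WHERE category < 12;

The ADD COLUMN rewrites the table once, but afterwards the partial-index build
reads the stored value instead of evaluating the two functions per row.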


Slow index creation

2021-02-16 Thread Paul van der Linden
Hi,

I have 2 functions:
CREATE FUNCTION foo(a text, b text, c text) RETURNS text AS
$func$
DECLARE
retVal text;
BEGIN
SELECT
  CASE
    WHEN a='v1' AND b='b1' THEN 'r1'
    WHEN a='v1' THEN 'r2'
    ... snip long list containing various tests on a, b and c
    WHEN a='v50' THEN 'r50'
  END INTO retval;
RETURN retVal;
END
$func$ LANGUAGE PLPGSQL IMMUTABLE;

CREATE FUNCTION bar(r text, geom geometry) RETURNS int AS
$func$
DECLARE
retVal int;
BEGIN
SELECT
  CASE
    WHEN r='r1' AND st_area(geom)>100 THEN 1
    WHEN r='r1' THEN 2
    ... snip long list containing various tests on r and st_area(geom)
    WHEN r='r50' THEN 25
  END INTO retval;
RETURN retVal;
END
$func$ LANGUAGE PLPGSQL IMMUTABLE;

and a large table t (100M+ records) with columns a, b, c and geom running
on PG 11, on spinning disks with 64GB memory and 28 cores.

When I create a simple geom index with CREATE INDEX ON t USING gist(geom)
it finishes in about an hour, but when I create a partial index using these
2 functions
CREATE INDEX ON t USING gist(geom) WHERE bar(foo(a,b,c),geom)<12 it takes
over 20 hours...

Is that because I'm using functions in the WHERE clause, or because my CASE
lists are quite long, or both?
Is there any way to speed up that index creation? Is upgrading to a newer
Postgres a viable option (so the JIT compiler can speed up the functions)?

Paul


Re: Possible improvement

2020-06-05 Thread Paul van der Linden
Ok, as always there's a lot more to take into account than when just looking
at it superficially.
And indeed your counterexample shows that you'd have to include all the
previous WHEN conditions as false too, e.g.
WHERE (x=0) IS DISTINCT FROM true AND 1/x > 100, which could become quite
messy (especially with nested CASEs).


On Fri, Jun 5, 2020 at 9:02 PM Tom Lane  wrote:

> Paul van der Linden  writes:
> > For the case where it isn't known if the CASE expression itself is
> > indexed, technically that should be added as a decision node in the
> > query planner.
>
> That'd be fairly hard to do, if we're regarding this as an expression
> simplification step, since expression simplification is run long before
> any consideration is given to indexes.  (Even if we were willing to
> contemplate reversing that ordering, it'd be hard to do, because we
> need the simplified expressions to compare to index expressions ---
> else we'd get fooled by irrelevant discrepancies that simplification
> is supposed to remove.)
>
> The alternative is to try to wire this into index path generation instead
> of treating it as a general-purpose expression simplification ... but that
> likewise seems pretty undesirable.  If you've got a case like this, you'd
> like it to be simplified whether it ends up as an indexqual or not.
>
> So, as I said, I'm inclined to dismiss David's complaint as an
> impracticable requirement.  The other issues I raised are far more
> significant.
>
> BTW, speaking of correctness, this seems like a pretty dire
> counterexample:
>
>   SELECT ... FROM
>     (SELECT CASE WHEN x = 0 THEN 'zero'
>                  WHEN 1/x > 100 THEN 'tiny'
>                  ELSE 'whatever' END AS class,
>             ...
>     ) ss
>   WHERE ss.class = 'tiny';
>
> Naive application of this transformation would convert the WHERE to
>
> WHERE 1/x > 100
>
> creating divide-by-zero failures where there should be none.
> I'm not sure how we get around that; in general the planner
> has little clue which operations can throw what errors.
>
> regards, tom lane
>


Re: Possible improvement

2020-06-05 Thread Paul van der Linden
Thanks for your thoughts.

For the case where it isn't known if the CASE expression itself is indexed,
technically that should be added as a decision node in the query planner.
After all, there are 2 possibilities to handle it, so it should be up to the
planner to choose the cheapest.

Having said that, if the time spent planning the query is *that* critical I
agree that it probably isn't worth it.
It's just that in my line of work the execution time of a query is many
orders of magnitude larger than the planning time (my record holder is a
query that runs for just over 3 days...)

On Fri, Jun 5, 2020 at 4:31 PM Tom Lane  wrote:

> David Rowley  writes:
> > On Fri, 5 Jun 2020 at 14:41, Paul van der Linden
> >  wrote:
> >> If I have a query like:
> >>
> >> SELECT * FROM (
> >> SELECT
> >> CASE
> >> WHEN field='value1' THEN 1
> >> WHEN field='value2' THEN 2
> >> END AS category
> >> FROM table1
> >> ) AS foo
> >> WHERE category=1
> >>
> >> doesn't use the index on field, while technically it could do that.
> >> Is it hard to implement drilling down the constant in the WHERE to
> within the CASE?
>
> > It doesn't look impossible to improve that particular case.  See
> > eval_const_expressions_mutator() in clauses.c at T_CaseExpr. However,
> > this would need to take constant folding further than we take it
> > today. Today we just have the ability to simplify expressions which
> > are, by themselves, an expression which will always evaluate to a
> > constant value. This case is more complex as it requires something
> > outside the CASE expr to allow the simplification to take place. In
> > this case, we'd need to look at the other side of the OpExpr to see
> > the const there before any transformation could simplify it.
>
> I'd tend to see this as a transformation rule that acts on equality-
> with-a-CASE-input, thereby avoiding the "action at a distance" problem.
>
> > It's
> > also not entirely clear that the simplification would always be a good
> > idea.  What, for example if there was an index on the case statement
> > but none on "field". The query may perform worse!
>
> FWIW, I'm not too fussed about that objection.  If we rejected new
> optimizations on the basis that somebody's optimized-for-the-old-way
> query might perform worse, almost no planner changes would ever get in.
> I think most people would feel that an optimization like this is an
> improvement.  (I recall coming across a similar case in an
> information_schema query just a few days ago.)  The hard questions
> I would ask are
> 1. Is the transformation actually correct?
> 2. Does it improve queries often enough to be worth the planning cycles
> expended to look for the optimization?
>
> As far as #1 goes, note that this CASE produces NULL if "field" is
> neither 'value1' nor 'value2', whereupon the equality operator would
> also produce NULL, so that simplifying to "field='value1'" is not
> formally correct: that would produce FALSE not NULL for other values
> of "field".  We can get away with the replacement anyway at the top
> level of WHERE, but not in other contexts.  Hence, it'd be wrong to
> try to make this transformation in eval_const_expressions(), which is
> applied to all expressions.  Possibly prepqual.c's canonicalize_qual()
> would be a better place.
>
> The real problem here is going to be objection #2.  The rules under
> which any optimization could be applied are nontrivial, so that we'd
> spend quite a bit of time trying to figure out whether the optimization
> applies ... and I'm afraid that most of the time it would not.
>
> regards, tom lane
>


Possible improvement

2020-06-04 Thread Paul van der Linden
Hi,

Don't know if this already came up earlier but I have an idea for
improvement.

If I have a query like:

SELECT * FROM (
SELECT
  CASE
  WHEN field='value1' THEN 1
  WHEN field='value2' THEN 2
  END AS category
FROM table1
) AS foo
WHERE category=1

doesn't use the index on field, while technically it could do that.
Is it hard to implement drilling down the constant in the WHERE to within
the CASE?
This is especially convenient with views (the inner SELECT) where the category
is some complex list of possibilities and you want to filter (the outer
SELECT) on specific categories.
I know a different solution could be creating an index on that CASE, but
(especially if you're experimenting a bit) it can be quite cumbersome to keep
that synchronized with the actual query.
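
A minimal sketch of that expression-index alternative (the index name is
invented); the planner only uses it when the query's CASE matches the indexed
expression exactly, which is the synchronization burden described above:

CREATE INDEX table1_category_idx ON table1 ((
  CASE
    WHEN field='value1' THEN 1
    WHEN field='value2' THEN 2
  END
));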

Is this something that could be put on some wishlist? If so, where are the
most-looked-at ones?

Paul

P.S. In replies please use reply to all...


Re: Safe to delete files?

2019-04-13 Thread Paul van der Linden
I did it in pgAdmin, without BEGIN/COMMIT.


On Sat, Apr 13, 2019 at 4:24 PM Adrian Klaver 
wrote:

> On 4/13/19 7:10 AM, Paul van der Linden wrote:
> > It was just 99 files of 1GB each for each id, and no, I didn't vacuum.
> > I did see disk usage dropping quite a lot after dropping those tables
> > though, so I expected Postgres to delete all unnecessary files for all
> > the tables.
> >
> > However when checking just now I saw that the files I was referring to
> > were indeed deleted somehow (I don't have autovacuum running).
> > So not sure how or why, but my problem is solved.
>
> Did you DROP the tables in a transaction and look at the directory
> before you issued COMMIT?
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Safe to delete files?

2019-04-13 Thread Paul van der Linden
It was just 99 files of 1GB each for each id, and no, I didn't vacuum.
I did see disk usage dropping quite a lot after dropping those tables
though, so I expected Postgres to delete all unnecessary files for all the
tables.

However when checking just now I saw that the files I was referring to were
indeed deleted somehow (I don't have autovacuum running).
So not sure how or why, but my problem is solved.

On Sat, Apr 13, 2019 at 2:06 AM Adrian Klaver 
wrote:

> On 4/12/19 1:11 PM, Paul van der Linden wrote:
> > Hi,
> >
> > For my process, I needed to drop all the tables in a tablespace except
> > one which I truncated.
> > After that I would have expected to have a couple of KB max in that
> > folder, but there was about 200GB in it.
> >
> > There were 2 sets of files (id1, id1.1 .. id1.99, and the same for
> > id2).
>
> Can you show the actual dir listing?
>
> > Tried the various options from
> > https://blog.2ndquadrant.com/postgresql-filename-to-table/ and oid2name
> > (with -i), to trace it back to a table but all came up empty.
> >
> > Now this folder has a bit of a history spanning several Postgres
> > versions and upgrades, and sometime in the past one of the upgrades went
> > horribly wrong, so my first thought was that these were possibly
> > leftovers from that mishap, but the file times were a bit later than that.
> > Also hard to tell because those tables are used as write-once,
> > read-a-lot, so I could not base the last usage on the file dates.
> >
> > Normally I would probably dare to delete those files, but after the
> > dropping and truncating, the 2 files without extension had the time of
> > the drop/truncate and were 0 bytes in length (unfortunately I didn't
> > check the file sizes before dropping/truncating).
> >
> > Are there other options to see whether these files are leftovers from
> > previous stuff and not used by Postgres (so I can safely delete them)?
> >
> > Postgres 11, just one database in use on it (the other one being a
> > PostGIS template), running on Windows Server 2012.
> >
> > In replies please use reply to all...
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Safe to delete files?

2019-04-12 Thread Paul van der Linden
Hi,

For my process, I needed to drop all the tables in a tablespace except one,
which I truncated.
After that I would have expected a couple of KB max in that folder,
but there was about 200GB in it.

There were 2 sets of files (id1, id1.1 .. id1.99, and the same for
id2).
Tried the various options from
https://blog.2ndquadrant.com/postgresql-filename-to-table/ and oid2name
(with -i), to trace it back to a table but all came up empty.
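
One of those lookups can also be done directly from SQL; a minimal sketch,
with the tablespace name and the file number as placeholders:

-- map a numeric file name (the filenode) back to the relation owning it, if any
SELECT pg_filenode_relation(
  (SELECT oid FROM pg_tablespace WHERE spcname = 'my_tablespace'),  -- placeholder
  12345                                                             -- placeholder filenode
);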

Now this folder has a bit of a history spanning several Postgres versions
and upgrades, and sometime in the past one of the upgrades went horribly
wrong, so my first thought was that these were possibly leftovers from that
mishap, but the file times were a bit later than that.
It is also hard to tell, because those tables are used as write-once,
read-a-lot, so I could not base the last usage on the file dates.

Normally I would probably dare to delete those files, but after the
dropping and truncating, the 2 files without extension had the time of the
drop/truncate and were 0 bytes in length (unfortunately I didn't check the
file sizes before dropping/truncating).

Are there other options to see whether these files are leftovers from
previous stuff and not used by Postgres (so I can safely delete them)?

Postgres 11, just one database in use on it (the other one being a PostGIS
template), running on Windows Server 2012.

In replies please use reply to all...