Re: [HACKERS] pg_upgrade changes can it use CREATE EXTENSION?

2017-08-30 Thread Regina Obe
Sorry for the cross posting on this one, but I think it's important both groups 
are aware.


>> I think this thread covers most of the issues.
>> https://lists.osgeo.org/pipermail/postgis-devel/2017-August/026355.html
>> My thought was is it possible for pg_upgrade to be taught to use CREATE
>> EXENSION if asked?

> We intentionally *don't* do that; pg_dump goes to a lot of trouble to
> duplicate the old extension contents exactly, instead.  There are a bunch
> of corner cases that would fail if we allowed the new installation to
> have different extension contents than the old.  Believe you me, we'd
> rather have just issued CREATE EXTENSION, but it doesn't work.

> Looking quickly at the thread you cite, I wonder how much of this problem
> is caused by including version numbers in the library's .so filename.

Most of it is.  That's why I proposed at least only bumping on major upgrade.  
So postgis 2.4 so would be called postgis-2.so instead of postgis-2.4.so

We would only change on disk format during major in which case pg_upgrade 
wouldn’t work for folks anyway (such as what happened going from PostGIS 1.5 to 
2.0)  


> Have you considered not doing that?  Our experience with maintaining the
> contrib modules is that it's easier to attach a version number to an
> individual function (in its C name, where it's irrelevant to SQL users).
> If you incompatibly upgrade a given function, you can leave a stub behind,
> with the old C symbol, that does nothing but throw an error if called.
> Or you can keep on supporting the old API if it's easy enough; it
> doesn't have to be a library-wide decision.
> Have you considered not doing that?  Our experience with maintaining the
> contrib modules is that it's easier to attach a version number to an
> individual function (in its C name, where it's irrelevant to SQL users).
> If you incompatibly upgrade a given function, you can leave a stub behind,
> with the old C symbol, that does nothing but throw an error if called.
> Or you can keep on supporting the old API if it's easy enough; it
> doesn't have to be a library-wide decision.

People were all worked up about breaking ABI  and also not being able to run 
two different versions of PostGIS in same cluster.
We rarely break ABI and if we did, like you said it wouldn't kill us
to keep the old C name around until we did a major upgrade.

So I'm all for that idea.  I figure we'll rarely need to do that anyway.

It's mostly PostGIS developers like me that need to run two different versions 
of PostGIS in same cluster mostly for regression testing.
Which is why I proposed having a configure switch which is by default off.

Here is my original vote request.
https://lists.osgeo.org/pipermail/postgis-devel/2017-August/026319.html


> My solution of let's not call it postgis-2.4  but just postgis-2  from
> thenceforward for the life of 2 major series because we don't break backward
> compatibility often in a PostGIS minor version got shot down.

> The thread you mention doesn't seem to include any arguments why not
>  to do that.

 >   regards, tom lane


Some people had issue with trying to do that at PostGIS 2.4 right after we 
already released the alpha and are less than a month away from release.
Though technically we haven't released beta yet so I didn't think it was that 
big of a deal.

But I'm willing to wait for PostGIS 2.5 to appease people.

Tom, as always, thanks for being a voice of reason,

Regina








-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_upgrade changes can it use CREATE EXTENSION?

2017-08-30 Thread Regina Obe
I'm not too familiar with the innards of pg_upgrade, but we've been
discussing it a lot for past couple of days and how it's causing issues for
PostGIS upgrades.

I think this thread covers most of the issues.

https://lists.osgeo.org/pipermail/postgis-devel/2017-August/026355.html

My thought was is it possible for pg_upgrade to be taught to use CREATE
EXENSION if asked? 

Right now we don't support PostgreSQL 11 on PostGIS 2.3 and we really would
like not to because there are too many changes done in 11 that we feel
queezy about backporting.
Even if we did, package maintainers would have to provide 2.3 on 11 and 2.4
on 11 just so people can pg_upgrade to PostgreSQL 11 and then 

ALTER EXTESNION postgis UPDATE;

To postgis 2.4.0

Given that latest PostgreSQL 11 head already doesn't compile against PostGIS
2.4, I'm not confident we can fix 2.4 for 11.  So this will continue to be
more of a problem especially at the rate that PostgreSQL is changing these
days.


Right now crafty users have to do something like this to use pg_upgrade

https://gist.github.com/Komzpa/994d5aaf340067ccec0e

My solution of let's not call it postgis-2.4  but just postgis-2  from
thenceforward for the life of 2 major series because we don't break backward
compatibility often in a PostGIS minor version got shot down.


Any thoughts on this?


Thanks,
Regina Obe
PostGIS PSC member



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-06-08 Thread Regina Obe

> But this line of thinking does strengthen my feeling that throwing an
error is the right thing to do for the moment.  If we allow v10 to accept
such cases but do something different from what we used to, that 
> will greatly complicate any future attempt to try to restore the old
behavior.

>   regards, tom lane

Agreed.  The other side benefit of throwing an error instead of just doing
something different is you'll find out how rampant the old behavior is :).

People are more likely to know to complain when their apps break than they
are if it just silently starts doing something different.

My main concern in these cases is the short-circuiting not happening.
Because in these cases, the code goes into areas that it shouldn't which is
likely to mess up some logic in hard to troubleshoot ways.
I think erroring out is the best compromise.

Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-06-07 Thread Regina Obe

> After chewing on this for awhile, I'm starting to come to the conclusion
that we'd be best off to throw an error for SRF-inside-CASE (or COALESCE).
Mark is correct that the simplest case of

>   SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 END
>   FROM table_with_columns_x_and_y_and_z;

> behaves just intuitively enough that people might be using it.  The new
implementation method cannot reasonably duplicate the old semantics for
that, which means that if we let it stand as-is we will be 

> silently breaking queries, even if we fix up some of the weirder corner
cases like what happens when the CASE can be const-simplified.  So I think
we'd be better off to make this throw an error, and force any 
> affected users to rewrite in a way that will work in both v10 and older
releases.

> As to *how* to throw an error, I think it should be possible to teach
parse analysis to detect such cases, with something like the ParseExprKind
mechanism that could be checked to see if we're inside a 
> subexpression that restricts what's allowed.  There are some other checks
like no-nested-aggregates that perhaps could be folded in as well.  Checking
at parse analysis ought to be sufficient because 
> rule rewriting could not introduce such a case where it wasn't before, and
planner subquery flattening won't introduce one either because we don't
flatten subqueries with SRFs in their tlists.

> If people are on board with throwing an error, I'll go see about writing a
patch.

>   regards, tom lane

+1

I'm not a fan of either solution, but I think what Tom proposes of throwing
an error sounds like least invasive and confusing.

I'd much prefer an error thrown than silent behavior change. Given that we
ran into this in 3 places in PostGIS code, I'm not convinced the issue is
all that rare.

Make sure to point out the breaking change in the release notes though and
syntax to remedy it.

Thanks,
Regina 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregation support for aggregate functions that use transitions not implemented for array_agg

2017-06-06 Thread Regina Obe
> On 6/6/17 13:52, Regina Obe wrote:
>> It seems CREATE  AGGREGATE was expanded in 9.6 to support 
>> parallelization of aggregate functions using transitions, with the 
>> addition of serialfunc and deserialfunc to the aggregate definitions.
>> 
>> https://www.postgresql.org/docs/10/static/sql-createaggregate.html
>> 
>> I was looking at the PostgreSQL 10 source code for some example usages 
>> of this and was hoping that array_agg and string_agg would support the 
>> feature.

> I'm not sure how you would parallelize these, since in most uses you want to 
> have a deterministic output order.

>-- 
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Good point.  If that's the reason it wasn't done, that's good just wasn't sure.

But if you didn't have an ORDER BY in your aggregate usage, and you did have 
those transition functions, it shouldn't be any different from any other use 
case right?
I imagine you are right that most folks who use array_agg and string_agg 
usually combine it with array_agg(... ORDER BY ..)

My main reason for asking is that most of the PostGIS geometry and raster 
aggregate functions use transitions and were patterned after array agg.

In the case of PostGIS the sorting is done internally and really only to 
expedite take advantage of things like cascaded union algorithms.  That is 
always done though (so even if each worker does it on just it's batch that's 
still better than having only one worker).
So I think it's still very beneficial to break into separate jobs since in the 
end the gather, will have  say 2 biggish geometries or 2 biggish rasters to 
union if you have 2 workers which is still better than having a million 
smallish geometries/rasters to union

Split Union 

Worker 1:

Parallel agg (internal sort geoms by box)  - Union

Worker 2: 
Parallel Agg (internal sort geoms )  - Union


Gather  Union(union, union) internal sort.


Thanks,
Regina






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Parallel Aggregation support for aggregate functions that use transitions not implemented for array_agg

2017-06-06 Thread Regina Obe
It seems CREATE  AGGREGATE was expanded in 9.6 to support parallelization of
aggregate functions using transitions, with the addition of serialfunc and
deserialfunc to the aggregate definitions.

https://www.postgresql.org/docs/10/static/sql-createaggregate.html

I was looking at the PostgreSQL 10 source code for some example usages of
this and was hoping that array_agg and string_agg would support the feature.
At a cursory glance, it seems they do not use this.
Examples I see that do support it are the average and standard deviation
functions.

Is there a reason for this or it just wasn't gotten to?


Thanks,
Regina





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-05-26 Thread Regina Obe

> "Regina Obe" <l...@pcorp.us> writes:
>> I figured out the culprit was the change in CASE WHEN behavior with 
>> set returning functions Had a criteria something of the form:
>> CASE WHEN some_condition_dependent_on_sometable_that_resolves_to_false 
>>  THEN (regexp_matches(...))[1] ELSE ...  END FROM sometable;

> You might want to consider changing such usages to use regexp_match()
instead of regexp_matches().

>   regards, tom lane

Thanks.  I ended up swapping out with substring which was a bit shorter than
regexp_match()[].

But I've got similar problems with PostGIS topology logic and the easiest
change to make was take advantage
of the fact that you guys are treating CASE constant ... THEN  SRF ...

Differently 

Than 

CASE not_constant_based_on_table_value THEN  SRF ..


So I switched those to constant checks.  This feels a little dirty and
fragile to me though.

Is this behavior going to stay or change?

It seems inconsistent from a user perspective that

CASE constant  == short-circuit skipping over SRFs that may  otherwise
fail

While 

CASE not_constant_table_dependent  doesn't short-circuit.

I can understand the motive behind it, it just feels a little inconsistent
from an end-user POV.


Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-05-25 Thread Regina Obe

> Did something change with how exclusion constraints are handled?  I'm
trying to troubleshoot a regression we are having with PostGIS raster
support.

> As best I can guess, it's because exclusion constraints that used to work
in past versions are failing in PostgreSQL 10 with an error something like
> this:

> ERROR:  conflicting key value violates exclusion constraint
"enforce_spatially_unique_test_raster_columns_rast"
> ERROR:  new row for relation "test_raster_columns" violates check
constraint "enforce_coverage_tile_rast"

> Unfortunately I don't know how long this has been an issue since we had an
earlier test failing preventing the raster ones  from being tested.

> Thanks,
> Regina


I figured out the culprit was the change in CASE WHEN behavior with set
returning functions

Had a criteria something of the form:

CASE WHEN some_condition_dependent_on_sometable_that_resolves_to_false THEN
(regexp_matches(...))[1] ELSE ...  END
FROM sometable;


One thing that seems a little odd to me is why these return a record


SELECT CASE WHEN strpos('ABC', 'd') > 1 THEN (regexp_matches('a (b) c',
'd'))[1] ELSE 'a' END;

SELECT CASE WHEN false THEN (regexp_matches('a (b) c', 'd'))[1] ELSE 'a' END
FROM pg_tables;



And this doesn't - I'm guessing it has to do with this being a function of
the value of table, but it seems unintuitive 
>From a user perspective.

SELECT CASE WHEN  strpos(f.tablename, 'ANY (ARRAY[') > 1 THEN
(regexp_matches('a (b) c', 'd'))[1] ELSE 'a' END
FROM pg_tables AS f;


Pre-PostgreSQL 10 this would return a row for each record in pg_tables



Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change?

2017-05-25 Thread Regina Obe
Did something change with how exclusion constraints are handled?  I'm trying
to troubleshoot a regression we are having with PostGIS raster support.

As best I can guess, it's because exclusion constraints that used to work in
past versions are failing in PostgreSQL 10 with an error something like
this:

ERROR:  conflicting key value violates exclusion constraint
"enforce_spatially_unique_test_raster_columns_rast"
ERROR:  new row for relation "test_raster_columns" violates check constraint
"enforce_coverage_tile_rast"

Unfortunately I don't know how long this has been an issue since we had an
earlier test failing preventing the raster ones  from being tested.

Thanks,
Regina





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-01-02 Thread Regina Obe

>> If things are out of order, why isn't just going to was_exact = false 
>> good enough?
>>
>> I'm not sure if the mistake is in our PostGIS code or something in 
>> PostgreSQL recheck logic.
>> If I change the elog(ERROR ...) to a elog(NOTICE, the answers  are 
>> correct and sort order is right.
>>
>> Under what conditions would cmp return less than 0?  I tried following 
>> the code in cmp_orderbyvals, but got lost and trying to put elog 
>> notices in to see what the distance is returning (I probably did it 
>> wrong), just ended up crashing by backend.

> cmp would return 0 if the estimated distance returned by the index AM were 
> greater than the actual distance.  
> The estimated distance can be less than the actual distance, but it isn't 
> allowed to be more.  See gist_bbox_distance for an example of a "lossy" 
> distance calculation, and more generally "git show 
> 35fcb1b3d038a501f3f4c87c05630095abaaadab".

>--
>Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Did you mean would return < 0 ? 

Since I thought 0 meant exact and not where it's Erroring?

I think for points then maybe we should turn it off, as this could just be 
floating point issues with the way we compute the index.
That would explain why it doesn't happen for other cases like  polygon / point 
in our code
or polygon /polygon in our code since the box box distance in our code would 
always be <= actual distance for those.

So maybe the best course of action is just for us inspect the geometries and if 
both are points just disable recheck.

It's still not quite clear to me even looking at that git commit, why those 
need to error instead of going thru recheck aside from efficiency.


Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2016-12-29 Thread Regina Obe
I've been trying to troubleshoot the cause of this PostGIS recheck bug we
have reported by two people so far.  The last test was a nice simple
repeatable one that triggered the issue:

https://trac.osgeo.org/postgis/ticket/3418


from what I have seen this only affects cases where we are doing a distance
check between two points, which we actually don't need to enable recheck for
anyway, but trying to disable that seems like just shoving the real problem
under the covers.
Where it errors is this line 272 in src/backend/executor/nodeIndexscan

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/exe
cutor/nodeIndexscan.c;h=3143bd94ec4499fba94b41693538b785c4b32e6c;hb=HEAD#l27
2


  /*
 259  * Was the ORDER BY value returned by the index accurate?
The
 260  * recheck flag means that the index can return inaccurate
values,
 261  * but then again, the value returned for any particular
tuple
 262  * could also be exactly correct.  Compare the value
returned by
 263  * the index with the recalculated value.  (If the value
returned
 264  * by the index happened to be exact right, we can often
avoid
 265  * pushing the tuple to the queue, just to pop it back out
again.)
 266  */
 267 cmp = cmp_orderbyvals(node->iss_OrderByValues,
 268   node->iss_OrderByNulls,
 269   scandesc->xs_orderbyvals,
 270   scandesc->xs_orderbynulls,
 271   node);
 272 if (cmp < 0)
 273 elog(ERROR, "index returned tuples in wrong order");
 274 else if (cmp == 0)
 275 was_exact = true;
 276 else
 277 was_exact = false;

If things are out of order, why isn't just going to was_exact = false good
enough?

I'm not sure if the mistake is in our PostGIS code or something in
PostgreSQL recheck logic.
If I change the elog(ERROR ...) to a elog(NOTICE, the answers  are correct
and sort order is right.

Under what conditions would cmp return less than 0?  I tried following the
code in cmp_orderbyvals, but got lost
and trying to put elog notices in to see what the distance is returning (I
probably did it wrong), just ended up crashing by backend.


Thanks for any thoughts,
Regina





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL Version 10, missing minor version

2016-08-28 Thread Regina Obe
The routine in PostGIS to parse out the version number from pg_config is
breaking in the 10 cycle.

Issue seems to be because there is no minor specified.

e.g.

pgconfig --version 

returns:

PostgreSQL 10devel

Instead of expected

PostgreSQL 10.0devel

Is this the way it's going to be or will there be a .0 tacked at the end
before release?

Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining?

2016-08-10 Thread Regina Obe

> Michael Banck  writes:
>> As I've been bitten by this problem recently, I thought I'd take a 
>> look at editing the PostGIS extension SQL file to this end, but 
>> contrary to the above, the @extschema@ feature only applies to 
>> non-relocatable extensions, from src/backend/commands/extension.c:

>>   * If it's not relocatable, substitute the target schema name for
>>  * occurrences of @extschema@.
>>   *
>>   * For a relocatable extension, we needn't do this.  There cannot be
>>   * any need for @extschema@, else it wouldn't be relocatable.

>> I'm not sure that logic is sound - even if setting @extschema@ 
>> explicitly in the SQL functions bodies kills inlining (not sure about
>> that) or wouldn't help for other reasons, ISTM this should be 
>> reconsidered in the light of the use case with materialized views 
> > during restore.

> It's not simply a matter of allowing the substitution to occur while
reading the extension script.  "Relocatable" means that we support ALTER
EXTENSION SET SCHEMA, which means moving all the 
> extension's objects into some new schema.  There's no good way to run
around and find places where @extschema@ was replaced in order to change
them to something else.

> Basically the point of @extschema@ is to support extensions that are
relocatable at installation time, but not afterwards.

>   regards, tom lane

FWIW on upcoming PostGIS 2.3, we have changed to not allow PostGIS to be
relocatable and schema qualifying internal calls. I took Tom's suggestion of
just using @extschema@
Which did mean we needed to not allow PostGIS to be relocatable anymore.  A
bit of a bummer.

Setting search_path on functions aside from killing inlining also killed
performance in other ways so that was a no go. Not sure if that is a known
issue or not and I haven't determined under what circumstances setting
search_path kills performance when index usage does not come into play.
I'll take it as a known.
Here is an example of such a case. 

https://trac.osgeo.org/postgis/ticket/3611

Now getting to the fact that using @extschema@ means requiring extension not
to be relocatable, that was a bummer and something we would need to deal
with if we ever forced everyone to install PostGIS in a specific schema so
that other extensions that rely on us can just know where PostGIS is
installed (or as Steve Frost suggested a way for dependency extensions to be
able to specify location of dependent extensions with a code such as
@extschema_postgis@ as we've got a bunch of extensions we are aware of
relying on postgis already (pgrouting, postgis_sfcgal, postgis_topology,
postgis_tiger_geocoder)

It would also be nice if the extension model had a way to allow the
extension authors the choice of handling the 'ALTER EXTENSION SET SCHEMA'
event short of monkeying with event triggers.

Yes we really need an extensions authors list to iron out and hear about
these pain points.  :)

Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can we amend gitignore so git postgresql works with git on windows using Msys/Mingw64

2016-03-26 Thread Regina Obe

> You can change the setting with:
> git config --global core.autocrlf input

> Still, it's fair to wonder if we shouldn't add an entry for this to our
.gitattributes.  I'm actually wondering why we wouldn't apply it to ALL text
files in git, not just the extensions Regina mentioned.

> regards, tom lane

David,

Thanks for the tip.  I'll give that a try.  Still I think it's safer to use
.gitattributes.

to Tom's comment,
Here is why.  For certain repos, certain files are designated for certain
OS.  So you don't want your developer committer's settings controlling the
output.

Take for example windows specific files.  I do have lines like

*.vcprojeol=crlf
*.bat   eol=crlf


I don't actually think it much matters even for those if they have LF,
except for batch scripts it looks a bit wonky in some windows editors if
they don't have CRLF breaks.


I'm also not clear what Mac does.  Does Mac still do CR (instead of LF, I
thought it used to).  Does that mean if they switch their ending I'd
suddenly be getting Mac endings?

https://help.github.com/articles/dealing-with-line-endings/

As that above links says - "This file is committed into the repository and
overrides an individual's core.autocrlf setting, ensuring consistent
behavior for all users, regardless of their Git settings." 

I think consistency is more important in this case.


Thanks,
Regina






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] If a schema is created as part of an extension, should all user created tables created in that schema be considered part of the extension?

2016-03-26 Thread Regina Obe

> I think the chain of events is that the tiger_data schema is marked as not
to be backed up (because it belongs to an extension) and then all of its
tables are marked as not to be backed up because they're in a schema that's
not to be backed up.  The latter 
> behavior is meant to implement --exclude-schema but it's firing on this
case too.  I think it might've behaved differently before c42df2d46.

> The whole idea of non-extension objects in a schema owned by an extension
seems pretty grotty to me though; that would mean that dropping the
extension forces dropping those objects, which I wouldn't think you want.
> So I'm not sure it's worth complicating matters to make this case behave
differently in pg_dump.

>   regards, tom lane

Ah indeed if I try my pg_dump 9.5beta1 executable it does backup the tables
so hasn't always been this way.

Are we on the same page, pg_dump should have the same respect for user
created data as the extension model does even if ischema is created by the
extension? I read your last statement two different ways. Sorry.

I was also wrong I can't choose to backup a table from that schema.  Not
sure what I was doing before, so that's not as inconsistent as I thought.


Some more foods for thought:

I guess what I find a little confusing is the schema created by the
extension to install extension functions/tables etc  is not considered part
of the extension.

For example this postgis_tiger_geocoder  is not relocatable.  So it installs
things in schema called tiger which is automatically created by create
extension via extension mechanism.  So when I drop the extension the schema
is still there.

I also can as a user then create tables, functions whatever in tiger schema
and those get backed up.

So as a user, why would I expect schema tiger_data to be any different from
schema tiger that actually holds all tiger installed stuff?

I can understand multiple extensions may install things in same schema even
if they are not schema relocateable so there is a reason for that, but still
it feels a bit odd that we are treating tiger_data differently.

I should also add, that if I do put things in tiger_data and then try to
drop the extension, the extension refuses to drop without cascade as it
should because those objects aren't part of the extension. So why should
pg_dump not have the same respect for user created data?


Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] If a schema is created as part of an extension, should all user created tables created in that schema be considered part of the extension?

2016-03-25 Thread Regina Obe
I just discovered something which was a little alarming to me.

In the postgis_tiger_geocoder extension, I had switched to having the schema
where user data download is stored created as part of create extension
script so I wouldn't need to check during load.

So I have a statement like this in the extension script:

DO 
$$
BEGIN
  IF NOT EXISTS (SELECT * FROM information_schema.schemata WHERE schema_name
= 'tiger_data') THEN
   CREATE SCHEMA tiger_data; 
  END IF;   
END 
$$ LANGUAGE 'plpgsql';


I usually exclude backing the tiger_data schema up in my regular backups
because it changes only yearly and takes up a good chunk of space, so this
is why I probably never noticed this issue.

When I recently did a full database backup (not excluding tiger_data), I
discovered that none of the tables were in the backup file.
What's interesting is I can explicitly choose to backup a single user table,
but I can never backup the whole schema.

To allow backup I had to do

ALTER EXTENSION postgis_tiger_geocoder DROP SCHEMA tiger_data;

I'm going to fix this on my end, by just dropping tiger_data schema from the
extension as part of the install script if it is part of the extension.  The
behavior seems a little odd to me though.

This was testing on - PostgreSQL 9.5.1, compiled by Visual C++ build 1800,
64-bit .

I should add, the user tables in tiger_data do inherit from skeletons in
tiger schema.  Though given that dropping the schema fixes the issue, I
didn't think that was related, but I didn't verify.


Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Can we amend gitattributes so git postgresql works with git on windows using Msys/Mingw64

2016-03-25 Thread Regina Obe

Typo in my last subject line - meant gitattributes. So should have lines
added like below


*.sheol=lf
*.ineol=lf
*.h.in  eol=lf
*.h eol=lf



Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Can we amend gitignore so git postgresql works with git on windows using Msys/Mingw64

2016-03-25 Thread Regina Obe
Tom et. al,

Thanks for fixing the SRF function order by thing.  That test now passes
now, but there is one other test failing I was going to troubleshoot.

The problem is I can't compile from git postgresql, and the issue is I think
because my git when it pulls down the files they come in as CRLF which gets
all screwed up when compiling under msys/ mingw64.
So I end up with non-sense messages like this:

In file included from libpq-int.h:44:0,
 from fe-auth.h:18,
 from fe-auth.c:42:
../../../src/include/libpq/pqcomm.h:44:8: error: redefinition of 'struct
sockaddr_storage'
 struct sockaddr_storage
^
In file included from ../../../src/include/pg_config_os.h:37:0,
 from ../../../src/include/c.h:101,
 from ../../../src/include/postgres_fe.h:25,
 from fe-auth.c:23:
c:/ming64gcc48/mingw64/x86_64-w64-mingw32/include/winsock2.h:260:10: note:
originally defined here
   struct sockaddr_storage {
  ^
In file included from libpq-int.h:44:0,
 from fe-auth.h:18,
 from fe-auth.c:42:
../../../src/include/libpq/pqcomm.h:49:3: error: unknown type name 'int64'
   int64  ss_align; /* ensures struct is properly aligned */


For other projects I work on, I usually put in the .gitattributes these
lines

*.sheol=lf
*.ineol=lf
*.h.in  eol=lf
*.h eol=lf


Right now to overcome the issue, I always have to run the folder thru
dos2unix or just build from the tar ball instead.

If you don't think it would pose problems for anyone else, that would be
helpful.

I imagine the only people it might are MS VS people, but then they probably
have no issue building from tar ball, which comes out all line feed anyway.
So I assume it wouldn't be an issue for them either.


Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Regina Obe

> I'm something of a backwards compatibility zealot, but I've become one for 
> very good reasons.  Personally, I'd rather we'd define precisely the usages 
> that are deprecated (I guess SRF-tlist in the presence of
> FROM) and force them to error out with an appropriate HINT rather than give a 
> different answer than they used to.  The problem here is that LATERAL is 
> still fairly new and there is a huge body of code out there leveraging the 
> 'bad' way, as it was for years > and years the only way to do a number of 
> useful things.

> merlin

FWIW: I prefer Merlin's solution of erroring out rather than returning an 
unexpected answer and if it's a buggy behavior it should be eradicated.

The reason is this.  For many  the (..).* ORDER BY .. looks equivalent to the 
lateral.  
More than a trivial amount of my time has been spent explaining to people why 
their raster queries are so slow because the SRF is called multiple times and 
they should switch to LATERAL usage.

So if the old solution is still going to have the same penalty's I would much 
assume just scrap it and break people's code in a clear and noticeable way they 
can't ignore.

There is nothing more frustrating than code that still works but gives you an 
answer different than what you are expecting.  Those kind of bugs stay buried 
for a while.

I think as long as it's noted in the release notes of the breaking change it's 
fine and called for if it makes your code cleaner and more manageable.

Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Regina Obe
In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
started failing.  I traced the issue down to a behavior change in 9.6 when
dealing with output of set returning functions when used with (func).*
syntax.

Here is an example not involving PostGIS.  Is this an intentional change in
behavior?

CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
RETURNS TABLE(id integer, junk1 text, junk2 text)
AS
$$
BEGIN
 RETURN QUERY SELECT id2 As id, $1 || $2::text As junk1, $1 || id2::text AS
junk2
FROM generate_series(1,2) As id2;
END;

$$
language 'plpgsql';

-- Get 16 rows in 9.6, Get 8 rows in 9.5
SELECT (dumpset(f.test, 'hello world' || f.test)).*
FROM generate_series(1,4) As f(test)
ORDER BY junk2;


I know that functions get called multiple times with (..).* and so it's
frowned upon, but before the results would only return once and I suspect
for people who are lazy and also don't mind the penalty cost they might just
use this syntax.
If its intentional I can change the tests to follow the best practice
approach.

I think the tests started failing around March 8th which I thought might
have to do with this commit: 9118d03a8cca3d97327c56bf89a72e328e454e63
(around that time) 
When appropriate, postpone SELECT output expressions till after ORDER
BY.
It is frequently useful for volatile, set-returning, or expensive
functions in a SELECT's targetlist to be postponed till after ORDER BY
and LIMIT are done.

Which involved change in output sort.

I'm not absolutely sure if this has to do with that commit, because we had
another test failing (where the order of the result changed, and putting in
an order by fixed that test). 



Thanks,
Regina






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining? - and backup / restore issue

2016-03-12 Thread Regina Obe

>> On 3/10/16 3:29 PM, Regina Obe wrote:
>> Take for example, I have tiger geocoder which relies on fuzzystrmatch.  I 
>> have no idea where someone installs fuzzystrmatch so I can't schema qualify 
>> those calls.  I use that dependent function to use to build an index on 
>> tables.

> This is something I've thought about as well, and I think the real problem is 
> search_path just isn't the right way to handle this. I think there needs to 
> be some way to definitively reference something that's part of an extension; 
> a method 
> that doesn't depend on whatever schema the extension happens to be installed 
> in.
--
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in 
> Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in 
> Treble! http://BlueTreble.com

I like that idea a lot though that sounds like something that requires a lot 
more work.  In the long run it would be good though especially since I expect 
more and more extensions will rely on each other.

I have similar concerns with pgRouting which I am a member of dev team too, and 
pgRouting  can't schema qualify any of the PostGIS calls because they have no 
idea where PostGIS is installed and the extension model as it stands
doesn't have provisions for referencing dependent extension locations.  That 
hasn't been a major  issue yet since pgRouting doesn't build functions that 
wrap PostGIS for indexing etc.  it is however more of a future concern and is a 
concern for people who build materialized views using pgRouting functions since 
all of those use PostGIS heavily.

There is even if we do that the case of people just building their own 
functions untop of other things.  I guess that one is not as much of a concern 
since they would generally know where their dependent functions are installed 
and can schema qualify.

Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining? - and backup / restore issue

2016-03-10 Thread Regina Obe

> Hmm.  The meaning of funcs.inline depends on the search_path, not just during 
> dump restoration but all the time.  So anything uses it under a different 
> search_path setting than the normal one will have this kind of problem; not 
> just 

> dump/restore.

> I don't have a very good idea what to do about that.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

I wasn't suggesting it was a restore only issue, but it's most felt when your 
data doesn't come back.  It affects any extension that relies on another 
extension.

Take for example, I have tiger geocoder which relies on fuzzystrmatch.  I have 
no idea where someone installs fuzzystrmatch so I can't schema qualify those 
calls.  I use that dependent function to use to build an index on tables.

The indexes don't come back.  What I was trying to suggest (side topic, forget 
about inline issue), 

Is the pg_dump should have a switch to allow users to tack on extra schemas

So that the dump restore set search_path thing looks like:

Set search_path=my_data_schema, pg_catalog, whatever_otehr_schemas_I_have_for_db

People can choose to use that switch or not.  So that way if people do have 
database search_paths, they normally run with, their data will come back.

Am I missing something here in this suggestion?  It's one of the most common 
complaints I hear about PostgreSQL in general and the crazy things people do to 
get around the issue like doing plain text dumps and parsing the dump.

Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining?

2016-03-10 Thread Regina Obe


-Original Message-
> From: Andreas Karlsson [mailto:andr...@proxel.se] 
> Sent: Tuesday, March 08, 2016 10:43 PM
> To: Regina Obe <l...@pcorp.us>; 'Robert Haas' <robertmh...@gmail.com>
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Is there a way around function search_path killing SQL 
> function inlining?


> Hi,

> I think Robert was asking for a test case for the database restore problems.

> The reason your no_inline() function cannot be inlined is due to lack of 
> support of inlining of any functions which have any config variable set, not 
> matter which. The search_path does not get any special treatment, and I am 
> not sure if it
> could in the general case since the new search path will apply too to 
> functions called by the function which changed the search path.

> Andreas

Restore has been an issue since as far back as I can remember.  It's more of an 
issue now now that people are using materialized views and raster constraints.
Anytime you have a materialized view or check constraint on a table that  uses 
a function that calls a non-schema qualified function you have a problem.

For a simple example lets say you created a database like this:
-- code start here --
CREATE DATABASE test;
ALTER DATABASE test
  SET search_path = public,funcs;

\connect test;
CREATE SCHEMA funcs;
CREATE OR REPLACE FUNCTION funcs._helper(box, box) RETURNS float8 AS
$$
  SELECT box_distance($1,$2);
$$
language 'sql' IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION funcs.inline(box,box) RETURNS boolean AS
$$
 SELECT $1 && $2 AND _helper($1,$2) = 0;
$$
language 'sql' IMMUTABLE;


CREATE TABLE bag_boxes(id serial primary key, geom box);
CREATE INDEX idx_bag_boxes_geom ON bag_boxes USING gist(geom);

INSERT INTO bag_boxes(geom)
SELECT ('((' || i::text || ',' || j::text || '), (' || k::text || ', ' || 
l::text || '))')::box
FROM generate_series(1,10) i , generate_series(11,20) j, generate_series(5,10) 
k, generate_series(10, 15) l ;


CREATE MATERIALIZED VIEW vw_bag_boxes AS
SELECT *
FROM bag_boxes 
WHERE funcs.inline('((1,2),(3,4))'::box, geom);

-- code end here --


When you back up the database, it would create a backup with this line:

SET search_path = public, pg_catalog;
--your create materialized view here

When you restore even if your database has search_paths set, your materialized 
view will not come back and will error out with:

ERROR:  function _helper(box, box) does not exist
LINE 2:  SELECT $1 && $2 AND _helper($1,$2) = 0;
 ^
HINT:  No function matches the given name and argument types. You might need to 
add explicit type casts.
QUERY:
 SELECT $1 && $2 AND _helper($1,$2) = 0;

In the case of table constraints, if you have any that rely on functions like 
this, your data fails validation so will not come back.

Ideally it would be nice if pg_dump allowed specifying additional schemas to 
add to the search_path.

We have a similar issue with Foreign tables, but that's probably a harder one 
to fix.

Anyway it seems I underestimated the many ways setting search path on functions 
(even ones that don't rely on anything else as far as I can tell) screws up 
performance
Even when it doesn't affect index usage so that has to be done with caution I 
guess.

Thanks,
Regina











-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there a way around function search_path killing SQL function inlining?

2016-03-07 Thread Regina Obe
>> On Fri, Mar 4, 2016 at 9:29 PM, Regina Obe <l...@pcorp.us>> wrote:
>> I think the answer to this question is NO, but thought I'd ask.
>>
>> A lot of folks in PostGIS land are suffering from restore issues, 
>> materialized view issues etc. because we have functions such as
>>
>> ST_Intersects
>>
>> Which does _ST_Intersects  AND &&
>>
>> Since _ST_Intersects is not schema qualified, during database restore 
>> (which sets the schema to the table or view schema), materialized 
>> views that depend on this do not come back.

> Could you provide a self-contained, reproducible test case that illustrates 
> this problem?  Ideally, one that doesn't involve installing PostGIS?

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Here is a script just involving the built in geometric types that has the same 
issue:


-- script starts here --
CREATE schema funcs;

set search_path=public,funcs;
CREATE OR REPLACE FUNCTION funcs._helper(box, box) RETURNS float8 AS
$$
  SELECT box_distance($1,$2);
$$
language 'sql' IMMUTABLE STRICT;


CREATE OR REPLACE FUNCTION funcs.no_inline(box,box) RETURNS boolean AS
$$
 SELECT $1 && $2 AND _helper($1,$2) = 0;
$$
language 'sql' IMMUTABLE;

--doing this kills inlining
ALTER FUNCTION funcs.no_inline(box, box) SET search_path=funcs;


--this one doesn't have search_path set so inlining works
CREATE OR REPLACE FUNCTION funcs.inline(box,box) RETURNS boolean AS
$$
 SELECT $1 && $2 AND _helper($1,$2) = 0;
$$
language 'sql' IMMUTABLE;


CREATE TABLE bag_boxes(id serial primary key, geom box);
CREATE INDEX idx_bag_boxes_geom ON bag_boxes USING gist(geom);

INSERT INTO bag_boxes(geom)
SELECT ('((' || i::text || ',' || j::text || '), (' || k::text || ', ' || 
l::text || '))')::box
FROM generate_series(1,10) i , generate_series(11,20) j, generate_series(5,10) 
k, generate_series(10, 15) l ;


SELECT b1.id, b2.id As id2
FROM bag_boxes AS b1 INNER JOIN bag_boxes As b2 ON no_inline(b1.geom, b2.geom);

-- plan looks like this -- PostgreSQL 9.5.1, compiled by Visual C++ build 1800, 
32-bit
/** Nested Loop  (cost=0.00..3402141.00 rows=432 width=8)
  Join Filter: no_inline(b1.geom, b2.geom)
  ->  Seq Scan on bag_boxes b1  (cost=0.00..66.00 rows=3600 width=36)
  ->  Materialize  (cost=0.00..84.00 rows=3600 width=36)
->  Seq Scan on bag_boxes b2  (cost=0.00..66.00 rows=3600 width=36) **/



SELECT b1.id, b2.id As id2
FROM bag_boxes AS b1 INNER JOIN bag_boxes As b2 ON inline(b1.geom, b2.geom);


-- plan looks like this PostgreSQL 9.5.1, compiled by Visual C++ build 1800, 
32-bit
/** Nested Loop  (cost=0.15..2359.00 rows=324 width=8)
  ->  Seq Scan on bag_boxes b1  (cost=0.00..66.00 rows=3600 width=36)
  ->  Index Scan using idx_bag_boxes_geom on bag_boxes b2  (cost=0.15..0.63 
rows=1 width=36)
Index Cond: (b1.geom && geom)
Filter: (box_distance(b1.geom, geom) = '0'::double precision) **/

-- end script --

Thanks,
Regina







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Is there a way around function search_path killing SQL function inlining?

2016-03-04 Thread Regina Obe
I think the answer to this question is NO, but thought I'd ask. 

A lot of folks in PostGIS land are suffering from restore issues,
materialized view issues etc. because we have functions such as

ST_Intersects

Which does _ST_Intersects  AND && 

Since _ST_Intersects is not schema qualified, during database restore (which
sets the schema to the table or view schema), materialized views that depend
on this do not come back.
It's also a serious issue with raster, though that one can be fixed by
setting search_path since the issue there doesn't use SQL inlining.

So I had this bright idea of setting the search_path of the functions to
where PostGIS is installed.

https://trac.osgeo.org/postgis/ticket/3490

To my disappointment, I noticed our spatial indexes no longer worked if I do
this since they rely on SQL inlining.

Schema qualifying our function calls is not an option at this time since

1) People install postgis in different schemas so we'd have to force them to
install in the same schema which I think will break a lot of 3rd party apps.
2) It's a lot of functions to hand touch.

Any suggestions are welcome.  Any other issues I should be aware of with 

ALTER FUNCTION .. set search_path..

Only other I noticed is it seems to be ignored in CREATE EXTENSION script
(at least when using dynamic execute).  I put it in and it seems to be
entirely ignored.

Thanks,
Regina








-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] On Conflict Do nothing errors IF conflict and there is a data type length or check failure

2016-02-16 Thread Regina Obe
I'm guessing this is by design but just wanted to confirm that since it
makes this feature not as useful for us.  

It also wasn't absolutely clear to me from the documentation.

We are running PostgreSQL 9.5.1 and if we do something like:

CREATE TABLE test(field1 varchar(5) primary key, field2 varchar(3));

INSERT INTO test(field1, field2) VALUES ('test','tes');

INSERT INTO test(field1,field2) VALUES('test', 'test')
ON CONFLICT(field1) DO NOTHING;

It triggers an error:

ERROR:  value too long for type character varying(3)

I think it does this for check constraints too.


Even though the record under consideration would be thrown out anyway.


Thanks,
Regina







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers