Re: [HACKERS] Proposal: knowing detail of config files via SQL

2015-01-30 Thread David Johnston
On Fri, Jan 30, 2015 at 12:05 PM, David Fetter da...@fetter.org wrote:

 On Sat, Jan 31, 2015 at 12:50:20AM +0900, Sawada Masahiko wrote:
 
  [postgres][5432](1)=# select * from pg_file_settings where name =
 'work_mem';
  -[ RECORD 1 ]--
  name   | work_mem
  setting| 128MB
  sourcefile | /home/masahiko/pgsql/master/3data/hoge.conf
  -[ RECORD 2 ]--
  name   | work_mem
  setting| 64MB
  sourcefile | /home/masahiko/pgsql/master/3data/postgresql.auto.conf

 Masahiko-san,

 I apologize for not communicating clearly.  My alternative proposal is
 to add a NULL-able sourcefile column to pg_settings.  This is as an
 alternative to creating a new pg_file_settings view.

 Why might the contents of pg_settings be different from what would be
 in pg_file_settings, apart from the existence of this column?


pg_settings uses the setting name as its primary key.

pg_file_settings uses a compound key consisting of the setting name and the
source file.

See work_mem in the provided example.

More specifically, pg_settings only cares about the currently active
setting, while pg_file_settings covers inactive settings as well.
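
To make the keying difference concrete, here is a sketch (the
pg_file_settings column list follows the example above and is otherwise an
assumption of the proposal):

SELECT name, setting FROM pg_settings WHERE name = 'work_mem';
-- one row: the currently active value

SELECT name, setting, sourcefile FROM pg_file_settings WHERE name = 'work_mem';
-- one row per (name, sourcefile) pair, whether active or not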

David J.


Re: [HACKERS] Small bug on CREATE EXTENSION pgq...

2015-01-28 Thread David Johnston
On Wednesday, January 28, 2015, Stephen Frost sfr...@snowman.net wrote:

 * David G Johnston (david.g.johns...@gmail.com) wrote:
  Jerry Sievers-3 wrote
   Hackers; I noticed this trying to import a large pg_dump file with
   warnings suppressed.
  
   It seems loading pgq sets client_min_messages to warning and leaves it
   this way which defeats an attempt to change the setting prior and have
   it stick.
  
   I tested with several other extensions in same DB and only pgq has the
   problem.
  
   Sorry if this is known/fixed already.
 
  This is not part of PostgreSQL proper and thus not supported by -hackers;
  you should report this directly to the authors.

 Ehh..  Shouldn't we try to take a bit more care that we reset things
 after a CREATE EXTENSION is run?  Not really sure how much effort we
 want to put into it, but I see a bit of blame on both sides here.



Fair enough, but reset to what?  I don't know the internal mechanics, but
if the session default is "warning" and a local change sets it to "notice",
then an unconditional reset would not get us back to the intended value.

Now, if extensions can be handled similarly to how functions operate, where
one can define function/extension-local settings and have them revert after
resolution, that might be OK.
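
As a hedged sketch of the function-local mechanism alluded to (the
extension-level equivalent is hypothetical), a per-function GUC setting
reverts automatically when the function exits:

CREATE FUNCTION quiet_fn() RETURNS void
    LANGUAGE plpgsql
    SET client_min_messages = warning   -- restored to the caller's value on exit
AS $$
BEGIN
    RAISE NOTICE 'this notice is suppressed inside quiet_fn';
END
$$;

If CREATE EXTENSION honored a comparable clause, the pgq script could
suppress its own noise without clobbering the session's setting.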

That said, while there isn't any way to prevent it, the log_min GUCs should
not be changed by extensions; that decision should be left up to the user.
The complaint is not that the setting should be reset but that the
installation script should not even care what the setting is.

David J.


Re: [HACKERS] Proposal: knowing detail of config files via SQL

2015-01-22 Thread David Johnston
On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Tom Lane-2 wrote
  regression=# alter system reset timezone;
  ALTER SYSTEM
  regression=# select pg_reload_conf();

  How does someone know that performing the above commands will result in
  the TimeZone setting being changed from Asia/Shanghai to US/Eastern?

 Is that a requirement, and if so why?  Because this proposal doesn't
 guarantee any such knowledge AFAICS.



The proposal provides SQL access to all possible sources of a variable's
value and, ideally, a means of ordering them by priority, so that a search
for TimeZone would return two records - one for postgresql.auto.conf and one
for postgresql.conf, numbered 1 and 2 respectively.  Looking at that result,
if the postgresql.auto.conf entry were removed, the user would know what
value from postgresql.conf would become active.  Furthermore, if
postgresql.conf has a setting AND there is a mapping in an #included file,
that information would be accessible via SQL as well.
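
A minimal sketch of what such a query could return; the priority column and
output shape are assumptions of this proposal, not an existing interface
(the values come from the example upthread):

SELECT name, setting, sourcefile, priority
FROM pg_file_settings
WHERE name = 'TimeZone'
ORDER BY priority;

   name   |    setting    |      sourcefile      | priority
----------+---------------+----------------------+----------
 TimeZone | Asia/Shanghai | postgresql.auto.conf |        1
 TimeZone | US/Eastern    | postgresql.conf      |        2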

David J.


Re: [HACKERS] Proposal: knowing detail of config files via SQL

2015-01-22 Thread David Johnston
On Thu, Jan 22, 2015 at 3:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Johnston david.g.johns...@gmail.com writes:
  On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Is that a requirement, and if so why?  Because this proposal doesn't
  guarantee any such knowledge AFAICS.

  The proposal provides SQL access to all possible sources of a variable's
  value and, ideally, a means of ordering them by priority, so that a
  search for TimeZone would return two records - one for
  postgresql.auto.conf and one for postgresql.conf, numbered 1 and 2
  respectively.  Looking at that result, if the postgresql.auto.conf entry
  were removed, the user would know what value from postgresql.conf would
  become active.  Furthermore, if postgresql.conf has a setting AND there
  is a mapping in an #included file, that information would be accessible
  via SQL as well.

 I know what the proposal is.  What I am questioning is the use-case that
 justifies having us build and support all this extra mechanism.  How often
 does anyone need to know what the next down variable value would be?
 And if they do need to know whether a variable is set in postgresql.conf,
 how often wouldn't they just resort to grep instead?  (Among other
 points, grep would succeed at noticing commented-out entries, which this
 mechanism would not.)

 GUC has existed in more or less its current state for about 15 years,
 and I don't recall a lot of complaints that would be solved by this.
 Furthermore, given that ALTER SYSTEM was sold to us as more or less
 obsoleting manual editing of postgresql.conf, I rather doubt that it's
 changed the basis of discussion all that much.


I doubt we'd actually see many complaints since, like you say, people are
likely to just use a different technique.  The only thing PostgreSQL itself
needs to provide is a master inventory of seen/known settings; the user
interface can be left up to other layers (psql, pgAdmin, extensions,
etc.).  It falls under making the system more user friendly.  But maybe
the answer for those users is that if your setup is so complex that this
would be helpful, you probably need to rethink your setup.  Then again, if
I only interact with the server via SQL I can at least issue RESET and know
the end result - after a config reload - without having to log into the
server and grep the config file, or know what the system defaults are for
settings that do not appear explicitly in postgresql.conf; all without
having to refer to documentation as well.

I'm doubtful it is going to interest any of the core hackers to put this in
place, but it at least warrants a couple of passes of brainstorming which
can then be memorialized on the wiki ToDo.

David J.


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-20 Thread David Johnston
On Tue, Jan 20, 2015 at 1:24 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 1/16/15 10:32 PM, David G Johnston wrote:



 Two changes solve this problem in what seems to be a clean way.
 1) Upon each parsing of postgresql.conf we store all assigned variables
 somewhere


 Parsing is relatively cheap, and it's not like we need high performance
 from this. So, -1 on permanent storage.


OK




  2) We display these assignments in a new pg_settings column named
 system_reset_val

 I would also extend this to include:
 a) upon each parsing of postgresql.auto.conf we store all assigned
 variables
 somewhere (maybe the same place as postgresql.conf and simply label the
 file
 source)


 You can not assume there are only postgresql.conf and
 postgresql.auto.conf. Complex environments will have multiple included
 files.


#include files still appear to come from postgresql.conf; I'm not
proposing we try to memorize every single instance of a variable
declaration and provide a global overlaps query - though that piece
already seems to be in place...


  b) add an alter_system_val field to show that value (or null)
 c) add a db_role_val to show the current value for the session via
 pg_db_role_setting


 You're forgetting that there are also per-role settings. And I'm with
 Robert; what's wrong with sourcefile and sourceline? Perhaps we just need
 to teach those about ALTER ROLE SET and ALTER DATABASE SET (if they don't
 already know about them).


Actually, there are not separate PER ROLE and PER DATABASE settings even
though there are different SQL commands.  The catalog is simply
pg_db_role_setting, using zero for the database or role (i.e., "all") as
necessary.  But I see where you are going and do not disagree that
precedence information could be useful.
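
A short illustration of how the different commands land in the same catalog
(a sketch; the database and role names are made up):

ALTER DATABASE mydb SET work_mem = '32MB';  -- stored with setrole = 0 ("all roles")
ALTER ROLE alice SET work_mem = '64MB';     -- stored with setdatabase = 0 ("all databases")
ALTER ROLE alice IN DATABASE mydb SET work_mem = '96MB';

SELECT setdatabase, setrole, setconfig FROM pg_db_role_setting;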

sourceline and sourcefile pertain only to the current value, while the
point of adding these other pieces is to provide a snapshot of all the
different mappings that the system knows about - instead of having to tell a
user to go look in two different files (and associated includes) and a
database catalog to find out what possible values are in place.  That
doesn't solve the need to scan the catalog to see other possible values -
though you could at least put a counter in pg_settings that indicates how
many pg_db_role_setting entries reference the given variable, so that if it
is non-zero the user is clued into the fact that they need to check out said
catalog table.
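
That counter could be derived today with a query along these lines (a
sketch; setconfig stores 'name=value' strings, so the match is textual):

SELECT s.name, s.setting,
       (SELECT count(*)
          FROM pg_db_role_setting d
         WHERE EXISTS (SELECT 1
                         FROM unnest(d.setconfig) AS c
                        WHERE c LIKE s.name || '=%')) AS db_role_refs
  FROM pg_settings s
 WHERE s.name = 'work_mem';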




  c.1) add a db_role_id to show the named user that is being used for the
 db_role_val lookup

 The thinking for c.1 is that in situations with role hierarchies and SET
 ROLE usage it would be nice to be able to query what the connection role -
 the one used during variable lookup - is.


 I'm losing track of exactly what we're trying to solve here, but...

 If the goal is to figure out what settings would be in place for a
 specific user connecting to a specific database, then we should create a
 SRF that does just that (accepting a database name and role name). You
 could then do...

 SELECT * FROM pg_show_all_settings( 'database', 'role' ) a;


This is fine - but I'm thinking about situations where a user immediately
issues SET ROLE on their session and typically operates as said user; if
they try doing an ALTER ROLE SET for this SET ROLE user it will not work,
because their login role is what matters.  This is probably a solution
looking for a problem, but it is a dynamic that one needs to be aware of.
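
To make the dynamic concrete (the role names are hypothetical): per-role
settings are applied at connection time based on the login role, so a later
SET ROLE does not pick them up:

ALTER ROLE app_role SET work_mem = '64MB';

-- connect as login_role, then:
SET ROLE app_role;
SHOW work_mem;  -- still login_role's value; app_role's setting never applied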


 I'm probably going overkill on this but there are not a lot of different
 sources, nor do they change frequently, so extending the pg_settings view
 to be more of a one-stop shop for this information seems to make sense.


 Speaking of overkill... one thing that you currently can't do is find out
 what #includes have been processed. Perhaps it's worth having a SRF that
 would return that info...

  As it relates back to this thread the desired merging ends up being done
 inside this view and at least gives the DBA a central location (well,
 along
 with pg_db_role_setting) to go and look at the configuration landscape for
 the system.


 I think the goal is good, but the interface needs to be rethought.
 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com


My main concern with the UI would be too much information; but otherwise,
if we accept the premise that we should include as much as possible and
then let the user (or us) provide more useful subsets based upon common
use-cases, the main issue is making sure we've identified all of the
relevant information that needs to be captured - even if it is something as
minor as the whole logon vs. current role dynamic.  I'm ignoring the
cost/benefit aspect of implementation for the moment, largely because I do
not know enough about the backend to make reasonable comments.  But much of
the data is already present in various views and catalogs - I just think
having a one-stop shop would be useful and go a long way.

Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-20 Thread David Johnston
On Tue, Jan 20, 2015 at 8:46 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Jan 20, 2015 at 4:07 PM, David Johnston
 david.g.johns...@gmail.com wrote:
  sourceline and sourcefile pertain only to the current value while the
  point of adding these other pieces is to provide a snapshot of all the
  different mappings that the system knows about; instead of having to tell
  a user to go look in two different files (and associated includes) and a
  database catalog to find out what possible values are in place.  That
  doesn't solve the need to scan the catalog to see other possible values -
  though you could at least put a counter in pg_settings that indicates how
  many pg_db_role_setting entries reference the given variable so that if
  non-zero the user is clued into the fact that they need to check out said
  catalog table.

 This last proposal seems pointless to me.  If the user knows about
 pg_db_role_setting, they will know to check it; if they don't, a
 counter won't fix that.  I can see that there might be some utility to
 a query that would tell you, for a given setting, all sources of that
 setting the system knew about, whether in configuration files,
 pg_db_role_setting, or the current session.  But I don't see that
 putting information that's already available via one system catalog
 query into a different system catalog query helps anything - we should
 presume DBAs know how to write SQL.


While this is not exactly a high-traffic catalog/view, why have them write
a likely 4-way join query (pg_db_role_setting is not the friendliest in its
current form), or even two separate queries, when we can give them a solid
and comprehensive view as standard?  I guess part of the mixup is that I am
basically talking about a view of multiple catalogs as a DBA UI, as opposed
to really even caring what specific catalogs (existing or otherwise) or
functions are needed to make the whole thing work.  Maybe it does all fit
directly on pg_settings, but tacking on some read-only columns to this
updatable view/table doesn't come across as something that should be
forbidden in general.

Maybe I am imagining a use-case that just isn't there, but if two separate
queries are needed, and we call one consolidated, then having that query
indicate whether the other query has useful information - and it is quite
likely that it will not - avoids the user expending the effort to run the
wasteful secondary query.

David J.


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-16 Thread David Johnston
On Fri, Jan 16, 2015 at 10:08 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Sat, Jan 17, 2015 at 10:02 AM, David G Johnston 
 david.g.johns...@gmail.com wrote:
   You're right.
   pg_setting and SHOW command use value in current session rather than
   config file.
   It might break these common infrastructure.
 
  Two changes solve this problem in what seems to be a clean way.
  1) Upon each parsing of postgresql.conf we store all assigned variables
  somewhere
  2) We display these assignments in a new pg_settings column named
  system_reset_val
 
  I would also extend this to include:
  a) upon each parsing of postgresql.auto.conf we store all assigned
 variables
  somewhere (maybe the same place as postgresql.conf and simply label the
 file
  source)

 Do we need to perform this parsing whenever user queries pg_settings?
 I think it might lead to extra cycles of reading file when user won't even
 need it and as the code is shared with SHOW commands that could be
 slightly complicated.


There would be no parsing upon reading of pg_settings, only lookups.  The
existing parsing would simply have its values saved to the catalogs that
will be involved in the underlying pg_settings view query.

David J.


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-16 Thread David Johnston
On Fri, Jan 16, 2015 at 10:19 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Sat, Jan 17, 2015 at 10:41 AM, David Johnston 
 david.g.johns...@gmail.com wrote:
  On Fri, Jan 16, 2015 at 10:08 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
 
  On Sat, Jan 17, 2015 at 10:02 AM, David G Johnston 
 david.g.johns...@gmail.com wrote:
You're right.
pg_setting and SHOW command use value in current session rather than
config file.
It might break these common infrastructure.
  
   Two changes solve this problem in what seems to be a clean way.
   1) Upon each parsing of postgresql.conf we store all assigned
 variables
   somewhere
   2) We display these assignments in a new pg_settings column named
   system_reset_val
  
   I would also extend this to include:
   a) upon each parsing of postgresql.auto.conf we store all assigned
 variables
   somewhere (maybe the same place as postgresql.conf and simply label
 the file
   source)
 
  Do we need to perform this parsing whenever user queries pg_settings?
  I think it might lead to extra cycles of reading the file when the user
  won't even need it, and as the code is shared with SHOW commands that
  could be slightly complicated.
 
 
  There would be no parsing upon reading of pg_settings, only lookups.
 The existing parsing would simply have its values saved to the catalogs
 that will be involved in the underlying pg_setting view query.
 
 So are you saying that whenever we read the files, we save the settings
 to some catalog (probably a new one)?

 Will that completely address the problem specified in this thread, given
 that those values could be stale (from the last server start or the last
 SIGHUP)?


Cache invalidation is a hard problem to solve :)

I am reasonably content with showing the user - who has just updated their
postgresql.conf file and restarted/SIGHUPed the server only to find that
said value hasn't taken effect - that their value is indeed sitting in
postgresql.conf but that other parts of the system are preempting it from
becoming the active value.

David J.


Re: [HACKERS] Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs

2015-01-06 Thread David Johnston
On Tue, Jan 6, 2015 at 4:15 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 1/6/15, 10:32 AM, Alvaro Herrera wrote:

 Tom Lane wrote:

  What would make sense to me is to teach the planner about inlining
 SQL functions that include ORDER BY clauses, so that the performance
 issue of a double sort could be avoided entirely transparently to
 the user.


 Wouldn't this be applicable to functions in other languages too, not
 only SQL?


 Dumb question... we can inline functions from other languages? What chunk
 of code handles that?


Not that I know of.  The point being made here is that suggesting an
alternative that requires inlining doesn't cover the entire purpose of this
feature, since the feature can be applied to functions that cannot be
inlined.
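
For context, a hedged sketch of the distinction (the function names are
made up): only simple LANGUAGE sql functions are candidates for inlining;
procedural-language bodies are opaque to the planner:

-- may be inlined into the calling query
CREATE FUNCTION srf_sql() RETURNS SETOF int LANGUAGE sql STABLE
AS $$ SELECT i FROM generate_series(1, 3) AS i ORDER BY i $$;

-- never inlined; always executed as a black box
CREATE FUNCTION srf_plpgsql() RETURNS SETOF int LANGUAGE plpgsql STABLE
AS $$ BEGIN RETURN QUERY SELECT i FROM generate_series(1, 3) AS i ORDER BY i; END $$;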

David J.


Re: [HACKERS] [GENERAL] ON_ERROR_ROLLBACK

2014-12-30 Thread David Johnston
On Tue, Dec 30, 2014 at 8:54 AM, Adrian Klaver adrian.kla...@aklaver.com
wrote:

 On 12/30/2014 07:43 AM, David G Johnston wrote:

 Tom Lane-2 wrote

 Bernd Helmle <mailings@> writes:

 --On 29. Dezember 2014 12:55:11 -0500 Tom Lane <tgl@.pa> wrote:

 Given the lack of previous complaints, this probably isn't backpatching
 material, but it sure seems like a bit of attention to consistency
 would be warranted here.


  Now that I read it I remember a client complaining about this some time
  ago.  I forgot about it, but I think there's value in it to backpatch.


 Hm.  Last night I wrote the attached draft patch, which I was intending
 to apply to HEAD only.  The argument against back-patching is basically
 that this might change the interpretation of scripts that had been
 accepted silently before.  For example
 \set ECHO_HIDDEN NoExec
 will now select noexec mode whereas before you silently got on mode.
 In one light this is certainly a bug fix, but in another it's just
 definitional instability.

 If we'd gotten a field bug report we might well have chosen to
 back-patch,
 though, and perhaps your client's complaint counts as that.

 Opinions anyone?


 -0.5 for back patching

 The one thing supporting this is that we'd potentially be fixing scripts
 that are broken but don't know it yet.  But the downside of changing
 active settings for working scripts - even if they are only accidentally
 working - is enough to counter that for me.  Being more liberal in our
 acceptance of input is more feature than bug fix even if we document that
 we accept more items.


 It is more about being consistent than liberal.  Personally I think a
 situation where for one variable 0 = off but for another 0 = on is a bug.


I can sorta buy the consistency angle, but what seals it for me is script
portability - the ability to write a script and instructions using the most
current release and have it run on previous versions without having to
worry about this kind of incompatibility.

So, +1 for back patching from me.

David J.


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-11 Thread David Johnston
On Thu, Dec 11, 2014 at 2:05 PM, Bruce Momjian br...@momjian.us wrote:

 On Thu, Dec 11, 2014 at 10:04:43AM -0700, David G Johnston wrote:
  Tom Lane-2 wrote

  Robert Haas <robertmhaas@> writes:

  On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus <josh@> wrote:
   While there were technical
   issues, 9.4 dragged a considerable amount because most people were
   ignoring it in favor of 9.5 development.
  
   I think 9.4 dragged almost entirely because of one issue: the
   compressibility of JSONB.
  
   2. The amount of pre-release testing we get from people outside the
   hard-core development crowd seems to be continuing to decrease.
   We were fortunate that somebody found the JSONB issue before it was
   too late to do anything about it.  Personally, I'm very worried that
   there are other such bugs in 9.4.  But I've given up hoping that any
   more testing will happen until we put out something that calls itself
   9.4.0, which is why I voted to release in the core discussion about it.
 
  The compressibility properties of a new type seem like something that
  should be mandated before it is committed - it shouldn't require good
  fortune that

 Odds are the next problem will have nothing to do with compressibility
 --- we can't assume old failures will repeat themselves.




tl;dr: assign two people, a manager/curator and a lead reviewer.  Give
the curator better tools and the responsibility to engage the community.
If the primary reviewer cannot review a patch in the current commitfest it
can be marked "awaiting reviewer" and moved to the next CF for evaluation
by the next pair of individuals.  At minimum, though, if it does get moved
the manager AND reviewer need to comment on why it was not handled during
their CF.  Also, formalize documentation targeting developers and reviewers
just like the documentation for users has been formalized and committed to
the repository.  That knowledge and history is probably more important than
the source code commit log and definitely should be more accessible to
newcomers.


While true, a checklist of things to look for and evaluate when adding a
new type to the system still has value.  How new types interact, if at all,
with TOAST seems like something that warrants explicit attention before
commit, and there are probably others (e.g., OIDs, input/output function
volatility and configuration, etc.).  Maybe this exists somewhere, but if
you are considering improvements to the commitfest application, a
top-of-page, always-visible checklist that can be bootstrapped based upon
the patch/feature type and modified for the specific nuances of the item in
question seems like it would be valuable.

If this had been in place before 9.3, then whatever category the multi-xact
patches fall into would have had its template modified to incorporate the
various research and testing - along with links to the discussions - that
resulted from the various bug reports that were submitted.  It could even
be structured both as an interactive checklist and as curated documentation
for developers and reviewers.  The wiki has some of this, but if the goal
is to encourage people to learn how to contribute to PostgreSQL it should
receive a similar level of attention and quality control that our
documentation for people wanting to learn how to use PostgreSQL receives.
But that is above and beyond the simple goal of having meaningful
checklists attached to each of the major commitfest items, whose TODO items
can be commented upon and serve as a reference for how close to commit a
feature may be.  Comments can be as simple as a place for people to upload
a psql script and say "this is what I did and everything seemed to
work/fail in the way I expected - on this platform."

Curation is hard, though, so I get why easier actions - mainly just
providing links to the mailing list - are what is currently being used.
Instead of the CF manager being a reviewer (which is a valid approach),
having them be a curator, and providing better tools geared toward that
role (both to do the work and to share the results), seems like a better
ideal.  Alongside that role a CF reviewer should maybe also be assigned.
The two people - reviewer and manager - would then be responsible for
ensuring that reviews are happening and that communication to, and
recruitment from, the community is being handled, respectively.

Just some off-the-cuff thoughts...

David J.


Re: [HACKERS] controlling psql's use of the pager a bit more

2014-11-13 Thread David Johnston
On Thu, Nov 13, 2014 at 10:55 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Thu, Nov 13, 2014 at 11:39 AM, Robert Haas robertmh...@gmail.com
 wrote:
  On Thu, Nov 13, 2014 at 11:54 AM, David G Johnston
  david.g.johns...@gmail.com wrote:
  Because I might be quite happy with 100 or 200 lines I can just scroll
  in my terminal's scroll buffer, but want to use the pager for more than
  that. This is useful especially if I want to scroll back and see the
  results from a query or two ago.
 
  +1
 
  +1 from me, too.

 me three (as long as the current behaviors are not messed with and the
 new stuff is 'opt in' somehow -- I also use the force quit option to
 less).



Being able to use \pset pager to toggle the capability seems useful.

Thus I'd suggest establishing a new pager_minlines option that, if unset
(the default), maintains the current behavior, but which can have a value
(0 to INT_MAX) where 0 ends up doing basically the same thing as the
"always" mode for pager.  Setting the value blank will cause the option to
be unset again and revert to the current behavior.
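
A sketch of how that could look in a session (pager_minlines is the
hypothetical option proposed here, not an existing psql variable):

\pset pager on
\pset pager_minlines 200   -- page only results longer than 200 lines
\pset pager_minlines       -- unset: revert to the current behavior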

David J.


Re: [HACKERS] Re: [GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4

2014-11-13 Thread David Johnston
On Thu, Nov 13, 2014 at 5:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Tom Lane-2 wrote
  In the meantime, I assume that your real data contains a small percentage
  of values other than these two?  If so, maybe cranking up the statistics
  target would help.  If the planner knows that there are more than two
  values in the column, I think it would be less optimistic about assuming
  that the comparison value is one of the big two.

  Is there any value (or can value be added) in creating a partial index of
  the form:

  archetype IN ('banner','some other rare value')

  such that the planner will see that such a value is possible but
  infrequent and will, in the presence of a plan using a value contained in
  the partial index, refuse to use a generic plan knowing that it will be
  unable to use the very specific index that the user created?

 The existence of such an index wouldn't alter the planner's statistics.
 In theory we could make it do so, but I seriously doubt the cost-benefit
 ratio is attractive, either as to implementation effort or the added
 planning cost.


[adding -general back in...]

While planner hints come to mind... on the SQL side, can we extend the
PREPARE command with two additional keywords?


PREPARE name [ ( data_type [, ...] ) ] [ [NO] GENERIC ] AS statement

I was originally thinking this could attach to EXECUTE, and maybe it could
go there as well.  If EXECUTE is bare, whatever the PREPARE used would be
in effect (a bare PREPARE exhibiting the current dynamic behavior).  If
EXECUTE and PREPARE disagree, EXECUTE wins and the current call is
(re-)prepared as requested.

We have introduced intelligence to PREPARE/EXECUTE that is not always
favorable but provided little way to override it if the user has superior
knowledge.  The dual role of prepared statements - to both prevent
SQL injection and create cacheable generic plans - further complicates
things.  In effect, by supplying NO GENERIC on the PREPARE, the caller is
saying they only wish to make use of the SQL-injection-prevention aspect of
prepared statements.  Adding the EXECUTE piece allows for the same plan to
be used in injection-prevention mode if the caller knows that the
user-supplied value does not play well with the generic plan.
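
Usage under the proposed (hypothetical) syntax might look like this,
reusing the archetype example from upthread (the items table is made up):

PREPARE q (text) NO GENERIC AS
    SELECT count(*) FROM items WHERE archetype = $1;

EXECUTE q('banner');   -- always planned with the actual parameter value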

David J.


Re: [HACKERS] idea: allow AS label inside ROW constructor

2014-10-23 Thread David Johnston
On Thu, Oct 23, 2014 at 8:51 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 10/23/2014 11:36 AM, David G Johnston wrote:

 Andrew Dunstan wrote

 On 10/23/2014 09:57 AM, Florian Pflug wrote:

 On Oct 23, 2014, at 15:39, Andrew Dunstan <andrew@> wrote:

 On 10/23/2014 09:27 AM, Merlin Moncure wrote:

 On Thu, Oct 23, 2014 at 4:34 AM, Pavel Stehule <pavel.stehule@> wrote:

 postgres=# select row_to_json(row(10 as A, row(30 as c, 20 AS B) as x));
          row_to_json
 --------------------------------
  {"a":10,"x":{"c":30,"b":20}}
 (1 row)
  wow -- this is great.   I'll take a a look.

  Already in  9.4:

 andrew=# select
 json_build_object('a',10,'x',json_build_object('c',30,'b',20));
             json_build_object
 -----------------------------------------
  {"a" : 10, "x" : {"c" : 30, "b" : 20}}
 (1 row)
 So I'm not sure why we want another mechanism unless it's needed in
 some
 other context.

 I've wanted to name the field of rows created with ROW() on more than
 one occasion, quite independent from whether the resulting row is
 converted
 to JSON or not. And quite apart from usefulness, this is a matter of
 orthogonality. If we have named fields in anonymous record types, we
 should
 provide a convenient way of specifying the field names.

 So to summarize, I think this is an excellent idea, json_build_object
 non-withstanding.

  Well, I think we need to see those other use cases. The only use case I
 recall seeing involves the already provided case of constructing JSON.

 Even if it simply allows CTEs and subqueries to form anonymous record
 types which can then be re-expanded in the outer layer for table-like
 final output, this feature would be useful.  When working with wide tables
 and using multiple aggregates and joins, being able to avoid specifying
 individual columns repeatedly is quite desirable.

 It would be especially nice to not have to use AS, though, if the source
 fields are already so named.




 You can already name the output of CTEs and in many cases subqueries, too.
 Maybe if you or someone gave a concrete example of something you can't do
 that this would enable I'd be more convinced.

 cheers

 andrew


Mechanically, I've wanted to do the following without creating an actual
type:

{query form}
WITH invoiceinfo (invoiceid, invoicedetailentry) AS (
SELECT invoiceid, ROW(itemid, itemdescription, itemcost, itemsale,
itemquantity)
FROM invoicelines
)
[... other CTE joins and stuff here...can carry around the 5 info fields in
a single composite until they are needed]
SELECT invoiceid, (invoicedetailentry).*
FROM invoiceinfo
;

{working example}
WITH invoiceinfo (invoiceid, invoicedetailentry) AS (
SELECT invoiceid, ROW(itemid, itemdescription, itemcost, itemsale,
itemquantity)
FROM (VALUES ('1',1,'1',0,1,1)) invoicelines (invoiceid, itemid,
itemdescription, itemcost, itemsale, itemquantity)
)
SELECT invoiceid, (invoicedetailentry).*
FROM invoiceinfo
;

This is made up but not dissimilar to what I have worked with.  That said,
I can and do usually either just join in the details one time, or I need to
do more with the details than just carry them around, and so providing a
named type usually ends up being the way to go.  Regardless, the form is
representative.

My most recent need for this ended up being best handled with named types
and support functions operating on those types, so it's hard to say I have
a strong desire for this, but anyway.

David J.



Re: [HACKERS] Trailing comma support in SELECT statements

2014-10-16 Thread David Johnston


 We might as well allow a final trailing (or initial leading) comma on a
 values list at the same time:

 VALUES
 (...),
 (...),
 (...),


 Do you know that this feature is proprietary and not based on ANSI SQL?
 Any user that uses this feature and wants to port to another database
 will hate it.

 Regards

 Pavel



I've got no complaint if "at the same time" means that neither behavior is
ever implemented...

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread David Johnston
On Fri, Sep 26, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Robert Haas robertmh...@gmail.com writes:
  If we want the narrowest possible fix for this, I think it's complain
  if a non-zero value would round to zero.  That fixes the original
  complaint and changes absolutely nothing else.  But I think that's
  kind of wussy.  Yeah, rounding 29 seconds down to a special magic
  value of 0 is more surprising than rounding 30 seconds up to a minute,
  but the latter is still surprising.  We're generally not averse to
  tighter validation, so why here?

 So in other words, if I set shared_buffers = 100KB, you are proposing
 that that be rejected because it's not an exact multiple of 8KB?  This
 seems like it's throwing away one of the fundamental reasons why we
 invented GUC units in the first place.


Both Robert and myself at one point made suggestions to this effect, but I
believe at this point we are both good simply solving the <1 rounding
(values that round down to zero) and calling it a day.
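
For reference, a sketch of the pre-fix semantics described in this thread
(not exact server output): log_rotation_age has a base unit of minutes, so
with floor() rounding

log_rotation_age = 30s    # rounds down to 0, silently disabling rotation
log_rotation_age = 90s    # rounds down to 1min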


 I apparently have got to make this point one more time: if the user
 cares about the difference between 30sec and 1min, then we erred in
 designing the GUC in question; it should have had a smaller unit.
 I am completely not impressed by arguments based on such cases.
 The right fix for such a case is to choose a different unit for the GUC.


You are right - both those values are probably quite stupid to set
log_rotation_age to... but we cannot error if they choose 1min.

Stephen suggested changing the unit, but that particular cure seems to be
considerably worse than simply changing floor() to ceil().

I'm out of arguments for why the minimally invasive solution is, for me,
the best of the currently proposed solutions.  That option gets my +1.  I
have to ask someone else to actually write such a patch, but it seems
straightforward enough, as no one has complained that the solution would be
hard to implement - only that they dislike the idea.

David J


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread David Johnston
On Fri, Sep 26, 2014 at 2:02 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Sep 26, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  If we want the narrowest possible fix for this, I think it's complain
  if a non-zero value would round to zero.  That fixes the original
  complaint and changes absolutely nothing else.  But I think that's
  kind of wussy.  Yeah, rounding 29 seconds down to a special magic
  value of 0 is more surprising than rounding 30 seconds up to a minute,
  but the latter is still surprising.  We're generally not averse to
  tighter validation, so why here?
 
  So in other words, if I set shared_buffers = 100KB, you are proposing
  that that be rejected because it's not an exact multiple of 8KB?

 Absolutely.  And if anyone is inconvenienced by that, then they should
 upgrade to a 386.  Seriously, who is going to set a value of
 shared_buffers that is not measured in MB?  And if they do, why
 shouldn't we complain if we can't honor the value exactly?  If they
 really put in a value that small, it's not stupid to think that the
 difference between 96kB and 104kB is significant.  Honestly, the most
 likely explanation for that value is that it's a developer doing
 testing.


Related thought - why don't we allow the user to specify 1.5MB as a valid
value?  Since we don't, the rounding on the 8kB stuff makes sense, because
not everyone wants to choose between 1MB and 2MB.  A difference of 1 minute
is not as noticeable.

In the thread Tom linked to earlier, the whole idea of a unit being 8kB
(instead of 1 block) is problematic, and this is just another symptom of
that.
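
To make the arithmetic concrete (a sketch of the rounding being debated,
assuming the default 8kB block size):

shared_buffers = 100kB   # 100/8 = 12.5 blocks; silently becomes a whole
                         # number of blocks instead of raising an error
shared_buffers = 1.5MB   # fractional values were rejected outright at the
                         # time of this thread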

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread David Johnston
On Fri, Sep 26, 2014 at 2:27 PM, Stephen Frost sfr...@snowman.net wrote:


 Agreed- they're independent considerations and the original concern was
 about the nonzero-to-zero issue, so I'd suggest we address that first,
 though in doing so we will need to consider what *actual* min values we
 should have for some cases which currently allow going to zero for the
 special case and that, I believe, makes this all 9.5 material and allows
 us a bit more freedom in deciding how to handle things more generally.


This is 9.5 material because 1) it isn't all that critical and 2) we DO
NOT want a system to fail to come up because of a GUC parsing error after a
minor-release update.

I don't get where we need to do anything else besides that... the whole
"actual min values" comment is unclear to me.

My thought on rounding is simply no-complaint, no-change; round-to-nearest
would be my first choice if implementing anew.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread David Johnston
On Friday, September 26, 2014, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Tom Lane wrote:

  The impression I had was that Stephen was thinking of actually setting
  min_val to 1 (or whatever) and handling zero or -1 in some out-of-band
  fashion, perhaps by adding GUC flag bits showing those as allowable
  special cases.  I'm not sure how we would display such a state of affairs
  in pg_settings, but other than that it doesn't sound implausible.

 I would think that if we're going to have out of band values, we
 should just use off, not 0 or -1.



Except "off" is not always semantically correct - some GUCs (not sure which
ones ATM) use zero to mean "default".  I think -1 always means off.
Instead of 0 and -1 we'd need "default" and "off", with the ability to
error if there is no meaningful default defined or if a feature cannot be
turned off.
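
A couple of illustrative examples of the magic-value pattern being
described (drawn from memory of the GUCs of that era, so treat this as a
hedged sketch):

log_rotation_age = 0       # 0 means "disable rotation" - an out-of-band off
autovacuum_work_mem = -1   # -1 means "use maintenance_work_mem" - a default flag
temp_file_limit = -1       # -1 means "no limit" - another off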

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread David Johnston
On Friday, September 26, 2014, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 David Johnston wrote:
  On Friday, September 26, 2014, Alvaro Herrera alvhe...@2ndquadrant.com
  wrote:
 
   Tom Lane wrote:
  
    The impression I had was that Stephen was thinking of actually setting
    min_val to 1 (or whatever) and handling zero or -1 in some out-of-band
    fashion, perhaps by adding GUC flag bits showing those as allowable
    special cases.  I'm not sure how we would display such a state of
    affairs in pg_settings, but other than that it doesn't sound
    implausible.
  
   I would think that if we're going to have out of band values, we
   should just use off, not 0 or -1.
 
  Except "off" is not always semantically correct - some GUCs (not sure
  which ones ATM) use zero to mean "default".  I think -1 always means off.
  Instead of 0 and -1 we'd need "default" and "off", with the ability to
  error if there is no meaningful default defined or if a feature cannot be
  turned off.

 Sure, off (and other spellings of boolean false value) and default
 where they make sense, and whatever other values that make sense.  My
 point is to avoid collapsing such logical values to integer/floating
 point values just because the option uses a numeric scale.


My comment was more about storage than data entry.  I'm not sure, though,
that we'd want to allow 0 as valid input even if it is acceptable for
Boolean.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-25 Thread David Johnston
On Thursday, September 25, 2014, Gregory Smith gregsmithpg...@gmail.com
wrote:

 On 9/25/14, 1:41 AM, David Johnston wrote:

 If the error message is written correctly most people upon seeing the
 error will simply fix their configuration and move on - regardless of
 whether they were proactive in doing so having read the release notes.


 The important part to realize here is that most people will never see such
 an error message.  There is a person/process who breaks the
 postgresql.conf, a process that asks for a configuration restart/reload
 (probably via pg_ctl), and then the postmaster process creates a server
 log entry that shows the error (maybe in pgstartup.log, maybe in pg_log,
 maybe in syslog, maybe in the Windows Event Log).

 In practice, the top of that food chain never knows what's happening at
 the bottom unless something goes so seriously wrong the system is down,
 [...]


And if the GUC setting here is wrong, the system will be down, right?
Otherwise the attempt at changing the setting will fail, and so even if the
message itself is not seen, the desired behavior of the system will remain
as it was - which is just as valid a decision as rounding to zero or 1.

Just like we don't take responsibility for people not reading release notes
or checking their configuration: if the DBA is not aware of where the GUCs
are being set and the logging destinations, that's not our problem.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-24 Thread David Johnston
On Thu, Sep 25, 2014 at 12:11 AM, Gregory Smith gregsmithpg...@gmail.com
wrote:

 On 9/24/14, 6:45 PM, Peter Eisentraut wrote:

 But then this proposal is just one of several others that break backward
 compatibility, and does so in a more or less silent way.  Then we might
 as well pick another approach that gets closer to the root of the problem.


 I was responding to some desire to get a quick fix that cut off the worst
 of the problem, and the trade-offs of the other ideas bothered me even
 more.  Obvious breakage is annoying, but small and really subtle version to
 version incompatibilities drive me really crazy. A 60X scale change to
 log_rotation_age will be big enough that impacted users should definitely
 notice, while not being so big it's scary.  Rounding differences are small
 enough to miss.  And I do see whacking the sole GUC that's set in minutes,
 which I was surprised to find is a thing that existed at all, as a useful
 step forward.


Why?  Do you agree that a log_rotation_age value defined in seconds is
sane?  If your reasoning is that everything else is defined in s and ms,
then that is a developer, not a user, perspective.  I'll buy into the
"everything is defined in the smallest possible unit" approach - in which
case the whole rounding thing becomes a non-issue - but if you leave some
of these in seconds then we should still add an error if someone defines an
insane millisecond value for those parameters.



 I can't agree with Stephen's optimism that some wording cleanup is all
 that's needed here.  I predict it will be an annoying, multiple commit job
 just to get the docs right, describing both the subtle rounding change and
 how it will impact users.  (Cry an extra tear for the translators)


Maybe I'm naive, but I seriously doubt this.  From what I can tell the
rounding isn't currently documented and really doesn't need much, if any,
documentation to be added.  An error instead of rounding down to zero would
be sufficient and self-contained: "The value specified is less than 1 in
the parameter's base unit."



 Meanwhile, I'd wager the entire work of my log_rotation_scale unit change
 idea, from code to docs to release notes, will take less time than just
 getting the docs done on any rounding change.  Probably cause less field
 breakage and confusion in the end too.


You've already admitted there will be breakage.  Your argument is that it
will be obvious enough to notice.  Why not just make it impossible?



 The bad case I threw out is a theorized one that shows why we can't go
 completely crazy with jerking units around, why 1000:1 adjustments are
 hard.  I'm not actually afraid of that example being in the wild in any
 significant quantity.  The resulting regression from a 60X scale change
 should not be so bad that people will suffer greatly from it either.
 Probably be like the big 10:1 change in default_statistics_target.  Seemed
 scary that some people might be nailed by it; didn't turn out to be such a
 big deal.

 I don't see any agreement on the real root of a problem here yet. That
 makes gauging whether any smaller change leads that way or not fuzzy.  I
 personally would be fine doing nothing right now, instead waiting until
 that's charted out--especially if the alternative is applying any of the
 rounding or error throwing ideas suggested so far.  I'd say that I hate to
 be that guy who tells everyone else they're wrong, but we all know I enjoy
 it.


Maybe not: I contend the root problem is that while we provide sane unit
specifications, we've assumed that people will always be providing values
greater than 1 - but if they don't, we silently use zero (a special value)
instead of telling them they messed up (made an error).  If the presence of
config.d and such makes this untenable, then I'd say those features have a
problem - not our current choice to define what sane is.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-24 Thread David Johnston
On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Gregory Smith gregsmithpg...@gmail.com writes:
  I don't see any agreement on the real root of a problem here yet. That
  makes gauging whether any smaller change leads that way or not fuzzy.  I
  personally would be fine doing nothing right now, instead waiting until
  that's charted out--especially if the alternative is applying any of the
  rounding or error throwing ideas suggested so far.  I'd say that I hate
  to be that guy who tells everyone else they're wrong, but we all know I
  enjoy it.

 TBH I've also been wondering whether any of these proposed cures are
 better than the disease.  The changes that can be argued to make the
 behavior more sane are also ones that introduce backwards compatibility
 issues of one magnitude or another.  And I do not have a lot of sympathy
 for let's not change anything except to throw an error in a case that
 seems ambiguous.  That's mostly being pedantic, not helpful, especially
 seeing that the number of field complaints about it is indistinguishable
 from zero.


Then what does it matter that we'd choose to error out?


 I am personally not as scared of backwards-compatibility problems as some
 other commenters: I do not think that there's ever been a commitment that
 postgresql.conf contents will carry forward blindly across major releases.
 So I'd be willing to break strict compatibility in the name of making the
 behavior less surprising.  But the solutions that have been proposed that
 hold to strict backwards compatibility requirements are not improvements
 IMHO.


Or, put differently, "the pre-existing behavior is fine, so don't fix what
isn't broken."

This patch simply fixes an oversight in the original implementation - that
someone might try to specify an invalid value (i.e., between 0 and 1).  If
0 and -1 are flags, then the minimum allowable value is 1.  The logic
should have been: range [1, something]; 0 (optionally); -1 (optionally).
Values of abs(x) between 0 and 1 (exclusive) should be disallowed and, like
an attempt to specify 0.5 (without units), should throw an error.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-24 Thread David Johnston
On Thu, Sep 25, 2014 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Johnston david.g.johns...@gmail.com writes:
  On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  TBH I've also been wondering whether any of these proposed cures are
  better than the disease.  The changes that can be argued to make the
  behavior more sane are also ones that introduce backwards compatibility
  issues of one magnitude or another.  And I do not have a lot of sympathy
  for let's not change anything except to throw an error in a case that
  seems ambiguous.  That's mostly being pedantic, not helpful, especially
  seeing that the number of field complaints about it is indistinguishable
  from zero.

  Then what does it matter that we'd choose to error out?

 The number of complaints about the *existing* behavior is indistinguishable
 from zero (AFAIR anyway).  It does not follow that deciding to throw an
 error where we did not before will draw no complaints.


Any change has the potential to draw complaints.  For you it seems the
desired complaint is "hey, I upgraded to 9.5 and my logs are being rotated
out every minute now; I thought I had that turned off."  Greg wants: "hey,
my 1-hour log rotation is now happening every minute."  If the error
message is written correctly, most people upon seeing the error will simply
fix their configuration and move on - regardless of whether they were
proactive in doing so having read the release notes.

And, so what if we get some complaints?  We already disallow the specified
values (instead, turning them into zeros), so it isn't like we are further
restricting user capabilities.

If I were to commit a patch that didn't throw an error, I'd ask the author
to provide an outline for each of the affected parameters and what it would
mean (if possible) for a setting currently rounded to zero to instead be
rounded to 1.

It's the same language in the release notes to get someone to avoid the
error as it is to get them to not be impacted by the rounding change, so
the difference is those people who would not read the release notes.  The
error outcome is simple, direct, and fool-proof - and conveys the fact that
what they are asking for is invalid.  It may be a little scary, but at
least we can be sure nothing bad is happening in the system.  If the
argument is that there are too few possible instances out there to expend
the effort to catalog every parameter, then there isn't enough surface area
to care about throwing an error.  And I still maintain that anyone setting
values for a clean installation would rather be told their input is not
valid instead of us silently making it so.

Changing the unit for log_rotation_age when there is basically no one
asking for that seems like the worse solution at face value, my +1
notwithstanding.  I gave that mostly because if you are going to overhaul
the system then making everything ms seems like the right thing to do.  I
think that such an effort would be a waste of resources.

I don't have clients, so maybe my acceptance of errors is overly optimistic
- but in this case I truly don't see many people even realizing that there
was a change, and those few that do should be able to read the error and
make the same change that they would need to make anyway if the rounding
option is chosen.

My only concern here, and it probably applies to any solution, is that the
change is implemented properly so that all possible user input areas throw
the error correctly and as appropriate to the provided interface.  That is,
interactive use immediately errors out without changing the default values,
while explicit/manual file changes cause the system to error at startup.  I
haven't looked into the code parts of this, but I have to imagine this is
already covered, since even with units and such invalid input is still
possible and needs to be addressed; this only adds one more check to that
existing routine.

David J.


Re: [HACKERS] RLS feature has been committed

2014-09-23 Thread David Johnston
On Tue, Sep 23, 2014 at 9:09 AM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-09-22 21:38:17 -0700, David G Johnston wrote:
  Robert Haas wrote
   It's difficult to imagine a more flagrant violation of process than
   committing a patch without any warning and without even *commenting*
   on the fact that clear objections to commit were made on a public
   mailing list.  If that is allowed to stand, what can we assume other
   than that Stephen, at least, has a blank check to change anything he
   wants, any time he wants, with no veto possible from anyone else?
 
  I'm of a mind to agree that this shouldn't have been committed...but I'm
 not
  seeing where Stephen has done sufficient wrong to justify crucifixion.

 I've not seen much in the way of 'crucifixion' before this email. And I
 explicitly *don't* think it's warranted. Also it's not happening.


I maybe got a little carried away with my hyperbole...



  At this point my hindsight says a strictly declaratory statement of
 reasons
  this is not ready combined with reverting the patch would have been
  sufficient; or even just a I am going to revert this for these reasons
  post.  The difference between building support for a revert and
 gathering a
  mob is a pretty thin line.

 The reason it's being discussed is to find a way to align the different
 views about when to commit stuff. The primary goal is *not* to revert
 the commit or anything but to make sure we're not slipping into
 procedures we all would regret. Which *really* can happen very
 easily. We're all humans and most of us have more than enough to do.


So, the second option then... and "I'm sorry, but this should never have
been committed" tends to cause one to think it should therefore be
reverted.


  Though I guess if you indeed feel that his actions were truly heinous you
  should also then put forth the proposal that his ability to commit be
  revoked.

 I think *you* are escalating this to something unwarranted here by the
 way you're painting the discussion.


Not everyone who reads -hackers knows all the people involved personally.
I had an initial reaction to these e-mails that I thought I would share,
nothing more.  I'm not going to quote the different comments that led me to
my feeling that the response was disproportionate to the offense, but after
a first pass - which is all many people would do - that is what I came away
with.  Though you could say I fell into the very same trap by reacting off
my first impression...

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-23 Thread David Johnston
On Tue, Sep 23, 2014 at 1:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Johnston david.g.johns...@gmail.com writes:
  My original concern was that things that are rounded to zero now will not
  be in 9.5 if the non-error solution is chosen.  The risk profile is
  extremely small but it is not theoretically zero.

 This is exactly the position I was characterizing as an excessively
 narrow-minded attachment to backwards compatibility.  We are trying to
 make the behavior better (as in less confusing), not guarantee that it's
 exactly the same.


I am going to assume that the feature designers were focused on wanting to
avoid:

1000 * 60 * 5

to get a 5-minute value set on a millisecond unit parameter.
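
A sketch of the convenience in question, using a real millisecond-based
parameter:

SET statement_timeout = '5min';   -- equivalent to the raw value 300000 (ms)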

The designer of the variable, in choosing a unit, has specified the minimum
value that they consider sane.  Attempting to record an insane value should
throw an error.

I do not support throwing an error on all attempts to round, but specifying
a value less than 1 in the variable's unit should not be allowed.  If such
a value is proposed, the user either made an error OR they misunderstand
the variable they are using.  In either case, telling them of their error
is more friendly than allowing them to discover the problem on their own.

 If we are only allowed to change the behavior by
 throwing errors in cases where we previously didn't, then we are
 voluntarily donning a straitjacket that will pretty much ensure Postgres
 doesn't improve any further.


I'm not proposing project-level policy here.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-23 Thread David Johnston
On Tue, Sep 23, 2014 at 3:05 PM, Greg Stark st...@mit.edu wrote:

 Fwiw I agree with TL2. The simplest, least surprising behaviour to explain
 to users is to say we round to nearest and if that means we rounded to zero
 (or another special value) we throw an error. We could list the minimum
 value in pg_settings and maybe in documentation.

I'm not sure TL2 would agree that you are agreeing with him...

To the other point, the minimum unit-less value is 1.  The unit that is
applied is already listed in pg_settings and the documentation.  While 0
is allowed it is more of a flag than a value.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-23 Thread David Johnston
On Tue, Sep 23, 2014 at 10:11 PM, Gregory Smith gregsmithpg...@gmail.com
wrote:

 On 9/23/14, 1:21 AM, David Johnston wrote:

 This patch should fix the round-to-zero issue.  If someone wants to get
 rid of zero as a special case let them supply a separate patch for that
 improvement.


 I am going to wander into this fresh after just reading everything once
 (but with more than a little practice with real-world GUC mucking) and say
 that, no, it should not even do that.  The original complaint here is real
 and can be straightforward to fix, but I don't like any of the proposals so
 far.

 My suggestion:  fix the one really bad one in 9.5 with an adjustment to
 the units of log_rotation_age.  That's a small commit that should thank
 Tomonari Katsumata for discovering the issue and even suggesting a fix
 (that we don't use) and a release note; sample draft below.  Stop there.


+1

I'm not convinced "minute" is wrong, but it does imply a level of
user-friendliness (or over-parenting) that we can do away with.



 We could just change the units for log_rotation_age to seconds, then let
 the person who asks for a 10s rotation time suffer for that decision only
 with many log files.  The person who instead chooses 10ms may find log
 rotation disabled altogether because that rounds to zero, but now we are
 not talking about surprises on something that seems sane--that's a fully
 unreasonable user request.


Given the following, why not just pick ms for log_rotation_age now?



 Following that style of fix all the way through to the sort of release
 notes needed would give something like this:

 log_rotation_age is now set in units of seconds instead of minutes.
 Earlier installations of PostgreSQL that set this value directly, to a
 value in minutes, should change that setting to use a time unit before
 migrating to this version.


Are there any special considerations for people using pg_dump vs. those
using pg_upgrade?


 If I were feeling ambitious about breaking configurations with a long-term
 eye toward improvement, I'd be perfectly happy converting everything on
 this list to ms.  We live in 64 bit integer land now; who cares if the
 numbers are bigger?


 Then the rounding issues only impact sub-millisecond values, making all
 them squarely in nonsense setting land.  Users will be pushed very clearly
 to always use time units in their postgresql.conf files instead of guessing
 whether something is set in ms vs. seconds.  Seeing the reaction to a units
 change on log_rotation_age might be a useful test for how that sort of
 change plays out in the real world.


If we are going to go that far why not simply modify the GUC input routine
to require unit values on these particular parameters?  Direct manipulation
of pg_settings probably would need some attention, but everything else could
reject integers and non-unit-specifying text.  Allow the same input routine
to accept the constants on|off|default, converting them internally into
whatever the given GUC requires, and you get the UI benefits without mucking
around with the implementation details.  Modify pg_settings accordingly to
hide the now-irrelevant pieces.  For UPDATE, a trigger can be used to
enforce the text-only input requirement.

As long as we do not make microseconds ("µs") a valid time unit it remains
impossible to directly specify a fraction of a millisecond.
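A hypothetical sketch of the input rule I am proposing - the rejection is
the new part, not anything PostgreSQL does today:

SET log_rotation_age = '7d';     -- accepted: carries an explicit unit
SET log_rotation_age = 10080;    -- rejected: bare number, no unit specified
SET log_rotation_age = DEFAULT;  -- constants remain accepted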

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread David Johnston
On Tuesday, September 23, 2014, Tom Lane t...@sss.pgh.pa.us wrote:

David G Johnston david.g.johns...@gmail.com writes:
  Can you either change your mind back to this opinion you held last month
 or
  commit something you find acceptable - its not like anyone would revert
  something you commit... :)

 Hey, am I not allowed to change my mind :-) ?

 Seriously though, the main point I was making before stands: if the
 details of the rounding rule matter, then we messed up on choosing the
 units of the particular GUC.  The question is what are we going to do
 about zero being a special case.

  Everyone agrees non-zero must not round to zero; as long as that happens
 I'm
  not seeing anyone willing to spending any more effort on the details.

 I'm not entirely sure Peter agrees; he wanted to get rid of zero being
 a special case, rather than worry about making the rounding rule safe
 for that case.  But assuming that that's a minority position:
 it seems to me that adding a new error condition is more work for us,
 and more work for users too, and not an especially sane decision when
 viewed from a green-field perspective.  My proposal last month was in
 response to some folk who were arguing for a very narrow-minded
 definition of backwards compatibility ... but I don't think that's
 really where we should go.

 regards, tom lane


This patch should fix the round-to-zero issue.  If someone wants to get rid
of zero as a special case let them supply a separate patch for that
improvement.

My original concern was things that are rounded to zero now will not be in
9.5 if the non-error solution is chosen.  The risk profile is extremely
small but it is not theoretically zero.

David J.


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-09-09 Thread David Johnston
On Tue, Sep 9, 2014 at 12:03 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Sep 8, 2014 at 6:20 PM, David G Johnston
 david.g.johns...@gmail.com wrote:
  On Mon, Sep 8, 2014 at 11:45 AM, Robert Haas [via PostgreSQL] [hidden
  email] wrote:
 
  On Thu, Sep 4, 2014 at 6:38 PM, David Johnston
  [hidden email] wrote:
 


  Ignoring style and such did anything I wrote help to clarify your
  understanding of why pgPutCopyEnd does what it does?  As I say this and
  start to try and read the C code I think that it is a good source for
  understanding novice assumptions but there is a gap between the docs and
 the
  code - though I'm not sure you've identified the only (or even proper)
  location.

 Honestly, not really.  I thought the language I previously discussed
 with Tom was adequate for that; and I'm a little confused as to why
 we've gotten on what seems to be somewhat of a digression into nearby
 but otherwise-unrelated documentation issues.


My theory here is that if you discover a single point of confusion that
cannot be attributed to a typo (or something basic like that) then why not
go looking around and see if there are any other issues in nearby
code/documentation?  Furthermore, not being the writer of said
code/documentation, a fresh perspective on the entire body of work - and
not just the few lines you quoted - has value.  For me, if nothing else I
took it as a chance to learn and evaluate the status quo.  And as noted
below there are a couple of items to address in the nearby code even if the
documentation itself is accurate - I just took a very indirect route to
discovering them.

Much of the current documentation is copy-and-pasted directly from the
source code - with a few comments tacked on for clarity.  I'm sure some
layout and style concerns were present during the copy-and-paste but the
user-facing documentation does not, and probably should not ideally,
directly emulate the source code comments.  Admittedly, in the libpq
section this can be considerably relaxed since the target user is likely a
C programmer.



  Unlike PQputCopyData there is no explicit logic (pqIsnonblocking(conn)) in
  PQputCopyEnd for non-blocking mode (there is an implicit one via pqFlush).
  This does seem like an oversight - if a minor one since the likelihood of
  not being able to add the EOF to the connection's buffer seems highly
  unlikely - but I would expect, on the basis of symmetry alone, for both of
  them to have buffer-filled testing logic.  And, depending on how large
  *errormsg is, the risk of being unable to place the data in the buffer
  isn't as small as the expected EOF case.

 Yeah, I think that's a bug in PQputCopyEnd().  It should have the same
 kind of pqCheckOutBufferSpace() check that PQputCopyData() does.


Anybody disagree here?




  I'm getting way beyond my knowledge level here but my assumption from the
  documentation was that the async mode behavior of returning zero revolved
  around retrying in order to place the data into the buffer - not
 retrying to
  make sure the buffer is empty.  The caller deals with that on their own
  based upon their blocking mode decision.  Though we would want to call
  pqFlush for blocking mode callers here since this function should block
  until the buffer is clear.

 PQputCopyEnd() already does flush; and PQputCopyData() shouldn't flush.


My point being that pqFlush is sensitive to whether the caller is in blocking
mode.

Also, PQputCopyData DOES FLUSH if the buffer is full...then attempts to
resize the buffer...then returns 0 (or -1 in blocking mode).  How absolute
is Tom's

"Well, we certainly do NOT want a flush in PQputCopyData."

?


  So, I thought I agreed with your suggestion that if the final pqFlush
  returns a 1 that pqPutCopyEnd should return 0.

 Well, Tom set me straight on that; so I don't think we're considering
 any such change.  I think you need to make sure you understand the
 previous discussion in detail before proposing how to adjust the
 documentation (or the code).


That was largely what this exercise was about for me...having gone through
all this and now re-reading Tom's wording I do understand and agree.

For the surrounding documentation I think we are good since PQputCopyData
already tests the buffer and correctly returns 0 in non-blocking mode.  The
issue with PQputCopyEnd seems to be a copy/paste error and an incorrect
assumption regarding non-blocking mode.


This doesn't address the usage bug we've been propagating, where someone
very well might use non-blocking mode and, upon seeing a return value of 1
from PQputCopyEnd, immediately call PQgetResult even if the buffer is not
completely flushed.  Our proposed documentation directs people to always
poll pqFlush after calling PQputCopyEnd, while the current documentation is
not so strict.  Is there anything to say/do about that fact or - more
importantly - is there any real risk here?
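Something like this minimal (untested) sketch of the stricter non-blocking
pattern is what I have in mind; assume an open non-blocking PGconn:

#include <stdio.h>
#include <libpq-fe.h>

/* Finish a COPY ... FROM STDIN on a non-blocking connection. */
static int
finish_copy(PGconn *conn)
{
    PGresult   *res;
    int         rc;

    if (PQputCopyEnd(conn, NULL) < 0)   /* -1: terminator could not be queued */
        return -1;

    /*
     * A return of 1 only means the end-of-data marker was queued; keep
     * flushing until the output buffer is actually empty.
     */
    while ((rc = PQflush(conn)) == 1)
        ;   /* real code would wait for the socket to become write-ready */
    if (rc < 0)
        return -1;

    /* Only now is it safe to collect the COPY command's result. */
    res = PQgetResult(conn);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
        return -1;
    }
    PQclear(res);
    return 0;
}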

David J.


Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread David Johnston
On Fri, Sep 5, 2014 at 6:27 PM, Marko Tiikkaja ma...@joh.to wrote:

 On 2014-09-05 11:19 PM, David G Johnston wrote:

 Marko Tiikkaja-4 wrote


  I probably couldn't mount a convincing defense of my opinion but at first
 blush I'd say we should pass on this.  Not with prejudice - looking at the
 issue periodically has merit - but right now it seems to be mostly adding
 clutter to the code without significant benefit.


 Are you talking about this particular option, or the notion of parser
 warnings in general?  Because if we're going to add a handful of warnings,
 I don't see why this couldn't be on that list.


This option, implemented in this specific manner.


  Tangential - I'd rather see something like EXPLAIN (STATIC) that would
 allow a user to quickly invoke a built-in static SQL analyzer on their
 query
 and have it point this kind of thing out.  Adding a catalog for STATIC
 configurations in much the same way we have text search configurations
 would
 allow extensions and users to define their own rulesets that could be
 attached to their ROLE GUC: default_static_analyzer_ruleset and also
 passed in as a modifier in the EXPLAIN invocation.


 If you knew there's a good chance you might make a mistake, you could
 probably avoid doing it in the first place.  I make mistakes all the time,
 would I then always execute two commands when I want to do something?
 Having to run a special check my query please command seems silly.


My follow-on post posited a solution that still uses the EXPLAIN mechanism
but is configured so that users can have it always on if desired.



  Anyway, the idea of using a GUC and evaluating every query that is written
 (the added if statements), is not that appealing even if the impact of one
 more check is likely to be insignificant (is it?)


 I'm not sure how you would do this differently in the EXPLAIN (STATIC)
 case.  Those ifs have to be somewhere, and that's always a non-zero cost.

 That being said, yes, performance costs of course have to be evaluated
 carefully.


If you add lots more checks that is lots more ifs, compared to a single if
to see whether static analysis should be attempted in the first place.  For
a single option it's largely a wash.  Beyond that I don't know enough to say
whether having the parser dispatch the query differently based upon it
being preceded with EXPLAIN (STATIC) or a standard query would be any
faster than a single IF for a single check.

The more options you can think of for a novice mode, the more appealing a
formal static analysis interface becomes - both for execution and also
because the number of options being introduced and managed can be made into
formal configurations instead of an independent but related set of GUC
variables.

David J.


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-09-04 Thread David Johnston
On Thu, Sep 4, 2014 at 5:13 PM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Sep 4, 2014 at 1:18 PM, David G Johnston
 david.g.johns...@gmail.com wrote:
  Specific observations would help though that is partly the idea - I've
 been
  more focused on clarity and organization even if it requires deviating
 from
  the current general documentation style.

 OK.

 -   to the network connection used by <application>libpq</application>.
 +   to the connection used by <application>libpq</application>.

 This change is unrelated to the topic at hand and seems not to be an
 improvement.


Fair point - though I did question the necessity of "network" in the
accompanying e-mail.

The implied suggestion is that if I do find any other areas that look like
they need fixing - even in the same file - I should separate them out into
a separate patch.  Though I have seen various "while I was in there I also
fixed such-and-such" commits previously, so the line is at least somewhat
fluid.


 +  <note>
 +   <para>
 +    Please review the function notes for specific interface protocols -
 +    the following is a simplified overview.
 +   </para>
 +  </note>

 This seems pointless.  Of course general documentation will be less
 specific than documentation for specific functions.


The existing wording was being verbose in order to be correct.  In a
summary like this I'd trade the precision that is included elsewhere for
being reasonably accurate and general.



 +   First, the application issues the SQL
 <command>COPY</command> command via <function>PQexec</function> or one
 +   of the equivalent functions.  The response
 +   will either be an error or a <structname>PGresult</> object bearing
 +   a status code for <literal>PGRES_COPY_OUT</literal> or
 +   <literal>PGRES_COPY_IN</literal> call implied by the specified copy
 +   direction (TO or FROM respectively).

 This implies that the response won't be a PGresult in the error case,
 but of course the function has the same C result type either way.


One of the trade-offs I mentioned...it's more style than anything, but
removing the parenthetical "(if there is not error in the command)" and
writing it more directly seemed preferable in an overview such as this.

Better:  "The function will either throw an error or return a PGresult
object[...]"

 +  <para>
 +   Second, the application should use the functions in this
 +   section to receive data rows or transmit buffer loads.  Buffer loads are
 +   not guaranteed to be processed until the copy transfer is completed.
 +  </para>

 The main change here vs. the existing text is that you're now using
 the phase buffer loads to refer to what gets transmitted, and data
 rows to refer to what gets received.  The existing text uses the term
 data rows for both, which seems correct to me.  My first reaction on
 reading your revised text was wait, what's a buffer load?.


So, my generalization policy working in reverse - since the transmit side
does not have to send complete rows, implying that it does here is (albeit
acceptably) inaccurate.




 +   Third, as lastly, when the data transfer is
 +   complete the client must issue a PQgetResult to commit the copy transfer
 +   and get the final <structname>PGresult</> object that indicates

 I assume you mean and lastly, since as lastly doesn't sound like
 good English.


Yep.




 -   At this point further SQL commands can be issued via
 -   <function>PQexec</function>.  (It is not possible to execute other SQL
 -   commands using the same connection while the <command>COPY</command>
 -   operation is in progress.)

 Removing this text doesn't seem like a good idea.  It's a quite
 helpful clarification.  The note you've added in its place doesn't
 seem like a good substitute for it, and more generally, I think we
 should avoid the overuse of constructs like note.  Emphasis needs to
 be used minimally or it loses its meaning.


Was trying to remove repetition here - happy to consider an alternative way
of doing so if the note is objectionable.



  If this is not acceptable I'm happy to incorporate the ideas of others to
  try and get the best of both worlds.

 +   <para>
 +    The return value of both these function can be one of [-1, 0, 1]
 +    whose meaning depends on whether you are in blocking or non-blocking mode.
 +   </para>

 The use of braces to list a set of possible values is not standard in
 mathematics generally, our documentation, or any other documentation I
 have seen.


Agreed




 +   <para>
 +    Non-Blocking Mode: A value of 1 means that the payload was
 +    placed in the queue while a -1 means an immediate and permanent failure.
 +    A return value of 0 means that the queue was full: you need to try again
 +    at the next wait-ready.
 +   </para>

 We generally avoid emphasizing captions or headings in this way.  The
 markup engine has no knowledge that Non-Blocking Mode is special; it
 will format and style this just as if you had written This is how it
 works: a value of 1 means  That's probably not 

Re: [HACKERS] PL/pgSQL 2

2014-09-02 Thread David Johnston
On Tue, Sep 2, 2014 at 4:48 PM, Joshua D. Drake j...@commandprompt.com
wrote:


 On 09/02/2014 09:48 AM, Bruce Momjian wrote:

  As a case in point, EDB have spent quite a few man-years on their Oracle
 compatibility layer; and it's still not a terribly exact match, according
 to my colleagues who have looked at it.  So that is a tarbaby I don't
 personally care to touch ... even ignoring the fact that cutting off
 EDB's air supply wouldn't be a good thing for the community to do.


 What any commercial entity and the Community do are mutually exclusive and
 we can not and should not determine what features we will support based on
 any commercial endeavor.


From where I sit the "mutually exclusive" argument doesn't seem to be true
- and in fact is something I think would be bad if it were.  We shouldn't
be afraid to add features to core that vendors are offering, but at the same
time the fact that the Oracle compatibility aspects are commercial instead
of in-core is a plus: it helps ensure that there are people making a decent
living off PostgreSQL and thus invested in its future - and directly
enticed to improve our product in order to get them more converts.  I don't
believe the community wants to compete on that basis, nor does standardizing
the layer and letting the vendors compete on consulting and implementation
services seem a strong investment for the community to make.

There is no way to consider development plans without considering what the
entire eco-system is doing: commercial and community both.  A blanket
statement like the above is a good way to make sure you don't get too
carried away with letting commercial vendors provide things that should be
in core; but at the same time the hurdle becomes higher if those features
can be had commercially.

My $0.02

David J.


Re: [HACKERS] PL/pgSQL 2

2014-09-01 Thread David Johnston
On Mon, Sep 1, 2014 at 11:12 PM, Craig Ringer cr...@2ndquadrant.com wrote:

 On 09/02/2014 09:40 AM, David G Johnston wrote:
  Random thought as I wrote that: how about considering how pl/pgsql
  functionality can be generalize so that it is a database API that
  another language can call?  In that way the server would drive the core
  functionality and the language would simply be an interpreter that
  enforces its specific notion of acceptable syntax.

 That's pretty much what we already have with the SPI and procedural
 language handler infrastructure. PL/Perl, PL/Python, etc exist because
 we have this.

 What do you see as missing from the current infrastructure? What can't
 be done that should be able to be done in those languages?


Yet pl/pgsql does not have to use SPI-interface type calls to interact
with PostgreSQL at the SQL level...

I don't have an answer to your questions, but the one I'm asking is whether
a particular language could hide all of the SPI stuff behind some custom
syntax so that it in effect looks similar to what pl/pgsql does today.  Or,
more to the point, does pl/pgsql use the same SPI interface behind the
scenes as PL/Perl or does it have its own special interface?

David J.


Re: [HACKERS] Built-in binning functions

2014-08-31 Thread David Johnston
On Sun, Aug 31, 2014 at 7:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Since bucket is the 'verb' here (in this specific case meaning lookup
 the
  supplied value in the supplied bucket definition) and width is a
 modifier
  (the bucket specification describes an equal-width structure) I suggest
  literal_bucket(val, array[]) such that the bucket is still the verb but
  now the modifier describes a structure that is literally provided.

 It's a very considerable stretch to see bucket as a verb here :-).
 Maybe that's why the SQL committee's choice of function name seems
 so unnatural (to me anyway).

 I was wondering about bucket_index(), ie get the index of the bucket
 this value falls into.  Or get_bucket(), or get_bucket_index() if you
 like verbosity.

 regards, tom lane


I got stuck on the thought that a function name should ideally be/include
a verb...

Even if you read it as a noun (and thus the verb is an implied "get") the
naming logic still holds.

I pondered a get_ version, though the argument for avoiding conflicting
user code decreases its appeal.

The good part about SQL standard naming is that the typical programmer
isn't likely to pick a conflicting name.

"bucket_index" is appealing by itself though user-code conflicts are
probable...and as bad as I think "width_bucket" is for a name, the fact is
that it currently exists and, even forced, consistency has merit.
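For reference, the two shapes side by side (using the existing name for the
new array form):

SELECT width_bucket(5.35, 0.024, 10.06, 5);       -- equal-width: returns 3
SELECT width_bucket(5.35, array[1, 3, 4, 6, 8]);  -- explicit thresholds: returns 3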

David J.


Re: [HACKERS] Fixed redundant i18n strings in json

2014-08-07 Thread David Johnston
On Thu, Aug 7, 2014 at 5:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Tom Lane-2 wrote
  Surely that was meant to read invalid number OF arguments.  The
 errhint
  is only charitably described as English, as well.  I'd suggest something
  like Arguments of json_build_object() must be pairs of keys and
 values.
  --- but maybe someone else can phrase that better.

  The user documentation is worth emulating here:
  http://www.postgresql.org/docs/9.4/interactive/functions-json.html

  errmsg(argument count must be divisible by 2)
  errhint(The argument list consists of alternating names and values)

 Seems reasonable to me.

  Note that I s/keys/names/ to match said documentation.

 Hm.  The docs aren't too consistent either: there are several other nearby
 places that say keys.  Notably, the functions json[b]_object_keys() have
 that usage embedded in their names, where we can't readily change it.

 I'm inclined to think we should s/names/keys/ in the docs instead.
 Thoughts?


Agreed - have the docs match the common API term usage in our
implementation.

Not sure it's worth a thorough hunt, but at least fix them as they are
noticed.
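For context, with the wording proposed above the failing case would read
something like:

SELECT json_build_object('name', 'Bob', 'age');
ERROR:  argument count must be divisible by 2
HINT:  The argument list consists of alternating keys and values.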

David J.


Re: [HACKERS] Re: Proposed changing the definition of decade for date_trunc and extract

2014-08-01 Thread David Johnston
On Fri, Aug 1, 2014 at 8:15 PM, Gavin Flower gavinflo...@archidevsys.co.nz
wrote:

 On 02/08/14 12:32, David G Johnston wrote:


 Any supporting arguments for 1-10 = 1st decade other than technical
 perfection?  I guess if you use data around and before 1AD you care about
 this more, and rightly so, but given sound arguments for both methods the
  one more useful to more users who I suspect dominantly care about years
  > 1900.

 So -1 to change for breaking backward compatibility and -1 because the
 current behavior seems to be more useful in everyday usage.

  Since there was no year zero: then it follows that the first decade
 comprises years 1 to 10, and the current Millennium started in 2001 - or am
 I being too logical???   :-)


This is SQL; only relational logic matters.  All other logic can be
superseded by committee consensus.

IOW - and while I have no way of checking - this seems like something that
may be governed by the SQL standard...in which case adherence to that would
trump mathematical logic.
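For reference, the current floor(year/10) behavior that a change would break:

SELECT extract(decade FROM DATE '1999-12-31');   -- 199
SELECT extract(decade FROM DATE '2000-01-01');   -- 200; 2000 starts a decade today
SELECT date_trunc('decade', DATE '2014-08-01');  -- 2010-01-01 00:00:00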

David J.


Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread David Johnston
On Fri, Jul 11, 2014 at 12:12 PM, Michael Banck mba...@gmx.net wrote:

 On Fri, Jul 11, 2014 at 11:07:21AM -0400, Tom Lane wrote:
  David G Johnston david.g.johns...@gmail.com writes:
   Benedikt Grundmann wrote
   That is it possible to tell the planner that index is off limits
   i.e.
   don't ever generate a plan using it?
 
   Catalog hacking could work but not recommended (nor do I know the
   proper
   commands and limitations).  Do you need the database/table to accept
   writes
   during the testing period?
 
  Hacking pg_index.indisvalid could work, given a reasonably recent PG.
  I would not try it in production until I'd tested it ;-)

 I wonder whether this should be exposed at the SQL level?  Hacking
 pg_index is left to superusers, but the creator of an index (or the
 owner of the schema) might want to experiment with disabling indices
 while debugging query plans as well.

 Turns out this is already in the TODO, Steve Singer has requested this
 (in particular, ALTER TABLE ...  ENABLE|DISABLE INDEX ...) in

 http://www.postgresql.org/message-id/87hbegz5ir@cbbrowne.afilias-int.info
 (as linked to from the TODO wiki page), but the neighboring discussion
 was mostly about FK constraints.

 Thoughts?


 Michael


Apparently work is ongoing to allow EXPLAIN to calculate the impact a
particular index has on table writes.  What is needed is a mechanism to
temporarily remove the impact of specific indexes on reads without
having to disable the index for writing.  Ideally this would work on a
per-query basis, so altering the catalog doesn't make sense.  I know we do
not want traditional planner hints, but in the spirit of the existing
enable_indexscan GUC there should be a
disable_readofindex='table1.index1,table1.index2,table2.index1' GUC
capability that would allow for session, user, or system-level control of
which indexes are to be used during table reads.
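For the record, the catalog hack Tom alludes to upthread looks roughly like
this (superuser only, emphatically not for production; the index name is
made up):

-- the planner stops considering the index; writes still maintain it
UPDATE pg_index SET indisvalid = false
WHERE indexrelid = 'some_table_some_idx'::regclass;

-- ...run the queries being tested...

UPDATE pg_index SET indisvalid = true
WHERE indexrelid = 'some_table_some_idx'::regclass;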

David J.


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread David Johnston
On Fri, Jun 13, 2014 at 11:32 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Tom Lane-2 wrote
  At the very least I think we should stay away from this syntax until
  the SQL committee understand it better than they evidently do today.
  I don't want to implement it and then get caught by a future
  clarification that resolves the issue differently than we did.

  Its not quite as unclear as you make it out to be:

 Yes it is.


Notwithstanding the decision-making of the SQL committee, I was rejecting
as inconsistent:

SET random_1 = 0;
SET random_2 = 0;
SET random_3 = random(1234);

The ambiguity regarding re-execute or copy still remains.


 That's not the reading I want, and it's not the reading you want either,
 but there is nothing in the existing text that justifies single
 evaluation.  So I think we'd be well advised to sit on our hands until
 the committee clarifies that.  It's not like there is some urgent reason
 to have this feature.



Agreed.


I don't suppose there is any support for, or prohibition on, the

one,two,three integer := generate_series(1,3);

interpretation...not that I can actually come up with a good use case that
wouldn't be better implemented via a loop in the main body.

David J.


Re: [HACKERS] RETURNING PRIMARY KEY syntax extension

2014-06-09 Thread David Johnston
On Monday, June 9, 2014, Ian Barwick i...@2ndquadrant.com wrote:



 On 09/06/14 14:47, David G Johnston wrote:

 Ian Barwick wrote

 Hi,

 The JDBC API provides the getGeneratedKeys() method as a way of
 retrieving
 primary key values without the need to explicitly specify the primary key
 column(s). This is a widely-used feature, however the implementation has
 significant
 performance drawbacks.

 Currently this feature is implemented in the JDBC driver by appending
 RETURNING * to the supplied statement. However this means all columns
 of
 affected rows will be returned to the client, which causes significant
 performance problems, particularly on wide tables. To mitigate this, it
 would
 be desirable to enable the JDBC driver to request only the primary key
 value(s).



ISTM that having a non-null returning clause variable when no returning is
present in the command makes things more complicated and introduces
unnecessary checks in the not uncommon case of multiple
non-returning commands being issued in series.

returningList was able to be null and so should returningClause.  Then if
non-null first check for the easy column listing and then check for the
more expensive PK lookup request.

Then again, the extra returning checks may just amount to noise.
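To restate the proposal itself for anyone skimming (hypothetical table):

-- today's JDBC getGeneratedKeys() implementation: every column comes back
INSERT INTO accounts (name) VALUES ('x') RETURNING *;

-- the proposed extension: only the primary key column(s)
INSERT INTO accounts (name) VALUES ('x') RETURNING PRIMARY KEY;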

David J.


Re: [HACKERS] 9.4 release notes

2014-05-19 Thread David Johnston
On Mon, May 19, 2014 at 10:23 AM, Bruce Momjian br...@momjian.us wrote:

 On Thu, May 15, 2014 at 06:08:47PM -0700, David G Johnston wrote:
  Some errors and suggestions - my apologizes for the format as I do not
 have
  a proper patching routine setup.
 

 Sorry, let me address some items I skipped on your list:

  IIUC: Logical decoding allows for streaming of statement-scoped database
  changes.

 I think Logical decoding does more than statement-scoped database
 changes, e.g. it can enable multi-master without triggers.  I am
 hesitant to mention specific items in the release notes for that reason.


Yeah, probably we need to look at this as a bigger unit of work and better
understand it first.  But "to be optionally streamed in a configurable
format" seems too vague.


  IIUC: Remove line length restrictions from pgbench.


 Uh, there is a specific place we removed the line length restriction in
 pg_bench.


I simply interpreted:

"Allow pgbench to process script files of any line length (Sawada Masahiko)

The previous line limit was BUFSIZ."

"script files of any line length" just doesn't sound right to me; and if my
re-wording is wrong then it also does not actually communicate what was
changed very well.



  style: add comma - Previously[,] empty arrays were returned (occurs
  frequently but is minor)

 Thanks for the comma adjustments --- I can't decide on that sometimes.


Not a grammar expert by any means, but there are generally two uses of
"previously":

He previously removed that from the build.
Previously, he removed that from the build.

In the first case "previously" simply modifies "removed" while in the second
case "previously" modifies the entire fragment and thus acts as a transition
word.  In the second case you want the comma; in the first you do not.  In
most cases a leading "previously" will be of the second form and so wants
the comma.

As an aside, the original wording of the above example would imply that a
non-empty array was returned, since a previously empty array is one that
now has content.


 FYI, the most frequently updated doc build is here:

 http://momjian.us/pgsql_docs/release-9-4.html

 --
   Bruce Momjian  br...@momjian.us        http://momjian.us
   EnterpriseDB                           http://enterprisedb.com

   + Everyone has their own god. +


David J.


Re: [HACKERS] 9.4 release notes

2014-05-19 Thread David Johnston
On Mon, May 19, 2014 at 12:36 AM, Bruce Momjian br...@momjian.us wrote:

 On Thu, May 15, 2014 at 06:08:47PM -0700, David G Johnston wrote:
  Some errors and suggestions - my apologizes for the format as I do not
 have
  a proper patching routine setup.
 
  Patch Review - Top to Bottom (mostly, I think...)

 I have made your suggested adjustments in the attached applied patch.

  Add ROWS FROM() syntax to allow horizontal concatenation of set-returning
  functions in the FROM-clause (Andrew Gierth)
  - Maybe a note about using this to avoid least-common-multiple
 expansion?

 Sorry, I do not understand this.


Apparently, neither do I.  I think I was confusing this with set-returning
functions in the select-list.  In the from-clause comma-separated functions
are combined using a cross-join, not LCM-expansion.




David J.


Re: [HACKERS] DISCARD ALL (Again)

2014-04-17 Thread David Johnston
On Thursday, April 17, 2014, Joshua D. Drake j...@commandprompt.com wrote:


 On 04/17/2014 07:07 PM, David G Johnston wrote:


 On 04/17/2014 05:24 PM, Tom Lane wrote:
   On the whole I'm not sure this is something we ought to get into.
   If you really need a fresh session, maybe you should start a
   fresh session.


 Isn't the whole point to avoid the reconnection overhead, especially for
 connection poolers?  DISCARD ALL shouldn't cause any cleanup that
 wouldn't otherwise occur when a session disconnects.  True global data
 (not just session global) should be excluded.


 The GD is global to the session only (Like temp tables).


Yes.  Tom's response makes it sound like the proposal is to throw away the
entire language environment for the whole server (thus needing superuser
privilege), so I'm pointing out that what we are discussing is not that
invasive.




 A better wording of the promise would be: discard all leaves the
 session in the same state it would be in if the underlying connection
 were dropped and re-established.


 Except that it doesn't.


But is this what you intend it to mean, by implementing these features, or
are you thinking something different?

David J.


Re: [HACKERS] polymorphic SQL functions has a problem with domains

2014-04-02 Thread David Johnston
Tom Lane-2 wrote
 Pavel Stehule <pavel.stehule@> writes:
 I was informed about impossibility to use a polymorphic functions
 together
 with domain types
 
 see
 
  create domain xx as numeric(15);
 
 create or replace function g(anyelement, anyelement)
 returns anyelement as
 $$  select $1 + $2 $$
 language sql immutable;
 
 postgres=# select g(1::xx, 2::xx);
 ERROR:  return type mismatch in function declared to return xx
 DETAIL:  Actual return type is numeric.
 CONTEXT:  SQL function g during inlining
 
 That example doesn't say you can't use polymorphic functions with domains.
 It says that this particular polymorphic function definition is wrong:
 it is not making sure its result is of the expected data type.  I don't
 recall right now whether SQL functions will apply an implicit cast on the
 result for you, but even if they do, an upcast from numeric to some domain
 over numeric wouldn't be implicit.

How would that be possible though?  Since any number of domains could be
defined over numeric, as soon as the + operator causes the domain to be
lost there is no way to get it back generically - you cannot just make it
SELECT ($1 + $2)::xx.

Does something like:

SELECT ($1 + $2)::$1%TYPE 

exist where you can explicitly cast to the type of the input argument?

David J.






Re: [HACKERS] polymorphic SQL functions has a problem with domains

2014-04-02 Thread David Johnston
Tom Lane-2 wrote
 David Johnston <polobo@> writes:
 Does something like:
 SELECT ($1 + $2)::$1%TYPE 
 exist where you can explicitly cast to the type of the input argument?
 
 I don't think SQL-language functions have such a notation, but it's
 possible in plpgsql, if memory serves.

Indeed.

http://www.postgresql.org/docs/9.3/interactive/plpgsql-declarations.html#PLPGSQL-DECLARATION-TYPE

Section 40.3.3

You lose inlining but at least it (should) work.
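An untested sketch of Pavel's example rewritten that way - the %TYPE
declaration is what carries the domain through:

create or replace function g(a anyelement, b anyelement)
returns anyelement
language plpgsql immutable
as $$
declare
  result a%TYPE;   -- resolves at call time to the actual type, e.g. domain xx
begin
  result := a + b; -- assignment casts the numeric sum back to the domain
  return result;
end
$$;

select g(1::xx, 2::xx);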

David J.








Re: [HACKERS] PQputCopyData dont signal error

2014-03-31 Thread David Johnston
steve k wrote
 Am I to understand then that I should expect no error feedback if copy
 fails because of something like attempting to insert alphabetic into a
 numeric?  
 
 I apologize for my ignorance, but all my return codes were always
 successful (PGRES_COMMAND_OK) even if nothing was copied due to garbage
 data.  Also, calling PQgetResult never returned any information either
 because everything was always PGRES_COMMAND_OK.  
 
 If that's what is supposed to happen then I have completely missed the
 boat and apologize for wasting everyone's time.  

In your example you successfully sent an error message to the server and so
PQputCopyEnd does not report an error since it did what it was asked to do. 
Later, when you finish the copy and ask for the error message, you probably
will get the same message that you sent here but you may get a different one
depending on whether the server encountered any other errors before you sent
the explicit error.  Regardless of the message you are guaranteed to get
back an error after calling PQgetResult.

It seems to me that you need to supply a simple C program - along with a
test file that you expect to fail - that never reports an error so that
others may evaluate actual code.  The likelihood of this NOT being human
error is small so we really have to see your actual code since that is most
probably the culprit.

Note that, as Tom mentioned, psql is open source.  If you are trying to
duplicate its behavior in your own code you should probably look to see what
it does.  The fact that you get the proper errors in some cases means that
the server is quite obviously capable of working in the manner you desire
and thus the client - again your specific software - it where any issue
likely resides.

Again, from what I'm reading PQputCopy(Data|End) will not report on data
parsing issues.  You will only see those after issuing PQgetResult - which
is noted in the last paragraph of the PQputCopyEnd documentation excerpt you
provided.  The put commands only report whether the sending of the data
was successful.



David J.






Re: [HACKERS] PQputCopyData dont signal error

2014-03-31 Thread David Johnston
steve k wrote
 I started with this:  
 DBInsert_excerpts6_test_cpdlc.cpp
 http://postgresql.1045698.n5.nabble.com/file/n5798049/DBInsert_excerpts6_test_cpdlc.cpp
   
 

Can you point out to me where in that code you've followed this instruction
from the documentation:

After successfully calling PQputCopyEnd, call PQgetResult to obtain the
final result status of the COPY command. One can wait for this result to be
available in the usual way. Then return to normal operation.

Since I cannot seem to find it.

David J.






Re: [HACKERS] PQputCopyData dont signal error

2014-03-31 Thread David Johnston
steve k wrote
 Sorry I can't provide more information but I do appreciate your time.  If
 you can't get any further with it I understand and don't expect another
 reply.  

For the benefit of others I'm reading this as basically you've found a
better way to do this so you are no longer concerned with correcting the
broken (incomplete) code you have provided.

It should be as simple as adding one more if statement between the copy-end
check and the overall failure check to see whether the copy command itself
failed in addition to the existing checks to see if sending the data or
ending the data send failed.

I will not pretend to write C code, but the general logic and requirements
seem quite clear from the documentation I have read/shown.

David J.






Re: [HACKERS] psql \d+ and oid display

2014-03-29 Thread David Johnston
Bruce Momjian wrote
 On Fri, Mar 28, 2014 at 03:53:32PM -0300, Fabrízio de Royes Mello wrote:
 On Fri, Mar 28, 2014 at 3:41 PM, Tom Lane <tgl@.pa> wrote:
 
  Bruce Momjian <bruce@> writes:
   On Thu, Mar 27, 2014 at 02:54:26PM -0400, Stephen Frost wrote:
   I believe Bruce was suggesting to show it when it is set to *not*
 the
   default, which strikes me as perfectly reasonable.
 
   We seem to be split on the idea of having Has OIDs display only
 when
   the oid status of the table does not match the default_with_oids
   default.
 
  FWIW, I think that having the display depend on what that GUC is set to
  is a seriously *bad* idea.  It will mean that you don't actually know,
  when looking at the output of \d, whether the table has OIDs or not.
 
  I could get behind a proposal to suppress the line when there are not
  OIDs, full stop; that is, we print either Has OIDs: yes or nothing.
  But I think this patch just makes things even more surprising when
  default_with_oids is turned on.
 
 
 Something like the attached ?
 
 I assume it would be more like my attachment, i.e. since we are only
 displaying it when OIDs exist, there is no value for oid status field
 --- just say Has OIDs or Includes OIDs, or something like that.
 
 I know some people are saying there is no need to change the current
 output --- I am only saying that the importance of showing the lack of
 OIDs has lessened over the years, and we should reconsider its
 importance.  If we reconsider and still think we are fine, that's good
 with me.  I am saying we should not just keep doing this because we have
 always displayed it in the past.

My belief is that 99% of the uses of \d are for human consumption
(machines should in most cases hit the catalogs directly), so strictly
displaying "Includes OIDs" when appropriate has my +1.

Uses of \d+ in regression suites will be obvious and quickly fixed and
likely account for another 0.9%.

psql backslash commands are not machine API contracts and should be adapted
for optimal human consumption; thus neutering the argument for maintaining
backward compatibility.

David J.








Re: [HACKERS] PQputCopyData dont signal error

2014-03-29 Thread David Johnston
steve k wrote
 I realize this is an old thread, but seems to be the only discussion I can
 find on this topic I have a problem with PQputCopyData function. It
 doesn't signal some error.   
 
 I am using from within a c++ program:  
  PQexec(m_pConn, COPY... ...FROM stdin), 
 
  followed by PQputCopyData(m_pConn, ssQuery.str().c_str(),
 ssQuery.str().length()), 
 
  finished with PQputCopyEnd(m_pConn, NULL).  

This may be oversimplified but...

To summarize: the PQexec needs a matching PQgetResult.  The PQputCopyEnd
only closes the preceding PQputCopyData.  The server is not compelled to
process the copy data until the client says that no more data is coming.
Once PQputCopyEnd has returned you are, for practical purposes, back to the
generic command-handling API and need to proceed as you would for any other
command - including being prepared to wait for long-running commands and
handle errors.

The request in this thread is for some means for the client to send partial
data and, if the server has chosen to process that data (or maybe the client
can compel it to start...) AND has encountered an error, then the client can
simply abort the rest of the copy and return an error.  The current API
return values deal with the results of the actual call just performed and
not with any pre-existing state on the server.  Tom's suggestion is to add a
polling method to query the current server state for those who care to
expend the overhead, instead of imposing that overhead on all callers.

The C API seems strange to me, a non-C programmer, but at a cursory glance
it is at least internally consistent and does provide lots of flexibility
(especially for sync/async options) at the cost of complexity and having to
make sure that you code in the appropriate PQgetResult checks or risk
ignoring errors and reporting success incorrectly.
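A condensed (untested) sketch of the blocking-mode sequence as I read the
docs - assumes an open connection and a table t(a int):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static void
copy_example(PGconn *conn)
{
    PGresult   *res = PQexec(conn, "COPY t FROM STDIN");

    if (PQresultStatus(res) != PGRES_COPY_IN)
        fprintf(stderr, "COPY did not start: %s", PQerrorMessage(conn));
    PQclear(res);

    const char *row = "not_a_number\n";          /* will fail integer parsing */

    PQputCopyData(conn, row, (int) strlen(row)); /* 1 = queued, NOT parsed */
    PQputCopyEnd(conn, NULL);                    /* 1 = terminator queued */

    /* Only here does the parse failure become visible. */
    res = PQgetResult(conn);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
    PQclear(res);
}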

David J.







Re: [HACKERS] psql \d+ and oid display

2014-03-27 Thread David Johnston
Bruce Momjian wrote
 When we made OIDs optional, we added an oid status display to \d+:
 
    test=> \d+ test
                          Table "public.test"
     Column |  Type   | Modifiers | Storage | Stats target | Description
    --------+---------+-----------+---------+--------------+-------------
     x      | integer |           | plain   |              |
    Has OIDs: no
 
 Do we want to continue displaying that OID line, or make it optional for
 cases where the value doesn't match default_with_oids?

If we didn't make it behave this way at the time of the change then what has
changed that we should make it behave this way now?  I like the logic
generally but not necessarily the change.  

The disadvantage of this change is that users (both end users and tools) of
the data now also have to look at what the default is (or was at the time
the text was generated) to know what a suppressed OIDs line means.  Given
how much information the typical \d+ generates I suspect the added noise
this introduces is quickly ignored by frequent users and not all that
disruptive to those who use \d+ infrequently.  Tools likely would prefer it
to be always displayed.

My $0.02

David J.






Re: [HACKERS] History of WAL_LEVEL (archive vs hot_standby)

2014-03-27 Thread David Johnston
shamccoy wrote
 Hello.  I've been doing some benchmarks on performance / size differences
 between actions when wal_level is set to either archive or hot_standby. 
 I'm not seeing a ton of difference.  I've read some posts about
 discussions as to whether this parameter should be simplified and remove
 or merge these 2 values.
 
 I'd like to understand the historic reason between have the extra
 hot_standby value.  Was it to introduce replication and not disturb the
 already working archive value?  If I'm new to Postgres, is there any
 strategic reason to use archive at this point if replication is
 something I'll be using in the future?  I'm not seeing any downside to
 hot_standby unless I'm missing something fundamental.
 
 Thanks,
 Shawn

There is a semantic difference in that "archive" is limited to WAL
shipping to a dead-drop area for future PITR.  "hot_standby" implies that
the WAL is being sent to another running system that is immediately reading
in the information to maintain an exact live copy of the master.

As I think both can be used for PITR I don't believe there is much downside,
technically or with resources, to using hot_standby instead of archive; but
I do not imagine it having any practical benefit either.
David J.







Re: [HACKERS] History of WAL_LEVEL (archive vs hot_standby)

2014-03-27 Thread David Johnston
Noah Misch-2 wrote
 On Thu, Mar 27, 2014 at 03:06:02PM -0700, David Johnston wrote:
 shamccoy wrote
  Hello.  I've been doing some benchmarks on performance / size
 differences
  between actions when wal_level is set to either archive or hot_standby. 
  I'm not seeing a ton of difference.  I've read some posts about
  discussions as to whether this parameter should be simplified and
 remove
  or merge these 2 values.
  
  I'd like to understand the historic reason between have the extra
  hot_standby value.  Was it to introduce replication and not disturb
 the
  already working archive value?  
 
 I think so.
 
  If I'm new to Postgres, is there any
  strategic reason to use archive at this point if replication is
  something I'll be using in the future?  I'm not seeing any downside to
  hot_standby unless I'm missing something fundamental.
 
 Probably not.  You might manage to construct a benchmark in which the
 extra
 master-side work is measurable, but it wouldn't be easy.
 
 There is a semantic difference in that archive is limited to wal
 shipping to a dead-drop area for future PITR.  hot_standby implies
 that
 the wal is being sent to another running system that is immediately
 reading
 in the information to maintain an exact live copy of the master.
 
 As I think both can be used for PITR I don't believe there is much
 downside,
 technically or with resources, to using hot_standby instead of archive;
 but
 I do not imagine it having any practical benefit either.
 
 wal_level=archive vs. hot_standby is orthogonal to the mechanism for
 transmitting WAL and time of applying WAL.  Rather, it dictates whether a
 PostgreSQL cluster replaying that WAL can accept read-only connections
 during
 recovery.  You can send wal_level=archive log data over streaming
 replication,
 even synchronous streaming replication.  However, any standby will be
 unable
 to accept connections until failover ends recovery.  On the flip side, if
 you
 use wal_level=hot_standby, a backup undergoing PITR can accept read-only
 connections while applying years-old WAL from an archive.  That is useful
 for
 verifying, before you end recovery, that you have replayed far enough.

Went looking for this in the docs...

http://www.postgresql.org/docs/9.3/interactive/runtime-config-wal.html#GUC-WAL-LEVEL

So I guess, no-restore/offline/online would be good names (and maybe
wal_restore_mode instead of wal_level) if we started from scratch.  Note
that no-restore does not preclude same-system recovery.

Just something to help me remember...

David J.







Re: [HACKERS] History of WAL_LEVEL (archive vs hot_standby)

2014-03-27 Thread David Johnston
David Johnston wrote
 
 Noah Misch-2 wrote
 On Thu, Mar 27, 2014 at 03:06:02PM -0700, David Johnston wrote:
 shamccoy wrote
  Hello.  I've been doing some benchmarks on performance / size
 differences
  between actions when wal_level is set to either archive or
 hot_standby. 
  I'm not seeing a ton of difference.  I've read some posts about
  discussions as to whether this parameter should be simplified and
 remove
  or merge these 2 values.
  
  I'd like to understand the historic reason between have the extra
  hot_standby value.  Was it to introduce replication and not disturb
 the
  already working archive value?  
 
 I think so.
 
  If I'm new to Postgres, is there any
  strategic reason to use archive at this point if replication is
  something I'll be using in the future?  I'm not seeing any downside to
  hot_standby unless I'm missing something fundamental.
 
 Probably not.  You might manage to construct a benchmark in which the
 extra
 master-side work is measurable, but it wouldn't be easy.
 
 There is a semantic difference in that archive is limited to wal
 shipping to a dead-drop area for future PITR.  hot_standby implies
 that
 the wal is being sent to another running system that is immediately
 reading
 in the information to maintain an exact live copy of the master.
 
 As I think both can be used for PITR I don't believe there is much
 downside,
 technically or with resources, to using hot_standby instead of archive;
 but
 I do not imagine it having any practical benefit either.
 
 wal_level=archive vs. hot_standby is orthogonal to the mechanism for
 transmitting WAL and time of applying WAL.  Rather, it dictates whether a
 PostgreSQL cluster replaying that WAL can accept read-only connections
 during
 recovery.  You can send wal_level=archive log data over streaming
 replication,
 even synchronous streaming replication.  However, any standby will be
 unable
 to accept connections until failover ends recovery.  On the flip side, if
 you
 use wal_level=hot_standby, a backup undergoing PITR can accept read-only
 connections while applying years-old WAL from an archive.  That is useful
 for
 verifying, before you end recovery, that you have replayed far enough.
 Went looking for this in the docs...
 
 http://www.postgresql.org/docs/9.3/interactive/runtime-config-wal.html#GUC-WAL-LEVEL
 
 So I guess, no-restore/offline/online would be good names (and maybe
 wal_restore_mode instead of wal_level) if we started from scratch.  Note
 that no-restore does not preclude same-system recovery.
 
 Just something to help me remember...
 
 David J.

Slightly tangential, but are the locking operations associated with the
recent bugfix generated in both (all?) modes or only hot_standby?  I thought
it strange that transient locking operations were written to WAL, though I
get it if they are there to support read-only queries.

David J.






Re: [HACKERS] Minimum supported version of Python?

2014-03-18 Thread David Johnston
Peter Eisentraut-2 wrote
 On 3/18/14, 11:22 AM, Andrew Dunstan wrote:
 Actually, if you run a buildfarm animal you have considerable control
 over what it tests.
 
 I appreciate that.  My problem here isn't time or ideas or coding, but
 lack of hardware resources.  If I had hardware, I could set up tests for
 every build dependency under the sun.

I don't imagine there is enough churn in this area that having a constantly
testing buildfarm animal is a strong need; but a wiki page dedicated to
Python Support in PostgreSQL where we can publicly and officially release
testing results and commentary would be an improvement.

It sounds like the only caveat to supporting 2.3 is that we don't
technically (or do we - is Decimal mandatory?) support an un-modified core
installation, but instead require that at least one add-on module be
installed.  Assuming that clears up all the errors Tom is seeing, then saying
that plpythonu works with a slightly modified 2.3 on all current releases of
PostgreSQL isn't a stretch, nor does it commit us to fixing bugs in the
unlikely event that any are discovered in two extremely stable environments.







Re: [HACKERS] Planner hints in Postgresql

2014-03-17 Thread David Johnston
Atri Sharma wrote
 On Mon, Mar 17, 2014 at 9:22 PM, Rajmohan C <csrajmohan@> wrote:
 
 I am implementing Planner hints in Postgresql to force the optimizer to
 select a particular plan for a query on request from sql input. I am
 having
 trouble in modifying the planner code. I want to create a path node of
 hint
 plan and make it the plan to be used by executor. How do I enforce this ?
 Should I create a new Plan for this ..how to create a plan node which can
 be then given directly to executor for a particular query?

 
 Planner hints have been discussed a lot before as well and AFAIK there is
 a
 wiki page that says why we shouldnt implement them. Have you referred to
 them?
 
 Please share if you have any new points on the same.
 
 Regards,
 
 Atri

http://wiki.postgresql.org/wiki/Todo

(I got to it via the FAQ link on the homepage and the Developer FAQ
section there-in.  You should make sure you've scanned that as well.)

Note the final section titled: Features We Do Not Want

Also, you need to consider what you are doing when you cross-post (a bad
thing generally) to -hackers and -novice.  As there is, rightly IMO, no
-novice-hackers list you should have probably just hit up -general.

Need to discuss the general why before any meaningful help on the how is
going to be considered by hackers.

David J.










Re: [HACKERS] Planner hints in Postgresql

2014-03-17 Thread David Johnston
Atri Sharma wrote
 On Mon, Mar 17, 2014 at 9:45 PM, Tom Lane <tgl@.pa> wrote:
 
 David Johnston <polobo@> writes:
  Need to discuss the general why before any meaningful help on the
 how is
  going to be considered by hackers.

 Possibly worth noting is that in past discussions, we've concluded that
 the most sensible type of hint would not be use this plan at all, but
 here's what to assume about the selectivity of this WHERE clause.
 That seems considerably less likely to break than any attempt to directly
 specify plan details.


 Isnt using a user given value for selectivity a pretty risky situation as
 it can horribly screw up the plan selection?
 
 Why not allow the user to specify an alternate plan and have the planner
 assign a higher preference to it during plan evaluation? This shall allow
 us to still have a fair evaluation of all possible plans as we do right
 now
 and yet have a higher preference for the user given plan during
 evaluation?

The larger question to answer first is whether we want to implement
something that is deterministic...

How about just dropping the whole concept of hinting and providing a way for
someone to say use this plan, or die trying.  Maybe require it be used in
conjunction with named PREPAREd statements:

PREPARE s1 (USING /path/to/plan_def_on_server_or_something_similar) AS
SELECT ...;

Aside from whole-plan specification I can definitely see where join/where
specification could be useful if it can overcome the current limitation of
not being able to calculate inter-table estimations.

David J.






Re: [HACKERS] Wiki Page Draft for upcoming release

2014-03-17 Thread David Johnston
I sent a post to -general with a much more detailed brain dump of my current
understanding on this topic.  The main point I'm addressing here is how to
recover from this problem.

Since a symptom of the problem is that pg_dump/restore can fail, saying that
(in some instances) the only viable recovery mechanism is pg_dump/restore
means that someone so afflicted is going to lose any data added since their
last good dump - if they still have one.

However, if the true data table does not actually contain any duplicate data
then such a dump/restore cycle (or, I would think, a REINDEX - or DROP/CREATE
INDEX chain) should resolve the problem.  Thus, if there is duplicate data
the user needs to (and can) identify and remove the offending records so that
subsequent actions do not fail with a duplicate key error.

If this is true then providing a query (or queries) that can identify the
problem records and delete them from the table - along with any staging up
that is necessary (like first dropping affected indexes if applicable) -
would be a nice addition.

David J.






Re: [HACKERS] Wiki Page Draft for upcoming release

2014-03-17 Thread David Johnston
Josh Berkus wrote
 All,
 
 https://wiki.postgresql.org/wiki/20140320UpdateIssues
 
 I'm sure my explanation of the data corruption issue is not correct, so
 please fix it.  Thanks!

I presume that because there is no way the master could have sent bad table
data to the replication slaves that performing the base backup on the slaves
is sufficient; i.e., it is not necessary to backup the master and distribute
that?

I'd either make the change myself or ask this kind of question somewhere
else more appropriate but I'm not sure of the proper protocol to follow.

David J.






Re: [HACKERS] Minimum supported version of Python?

2014-03-17 Thread David Johnston
Peter Eisentraut-2 wrote
 On Sat, 2014-03-15 at 20:55 -0400, Tom Lane wrote:
 Our documentation claims that the minimum Python version for plpython
 is 2.3.  However, an attempt to build with that on an old Mac yielded
 a bunch of failures in the plpython_types regression test,
 
 It has frequently been the case that the last supported version does not
 fully pass the regression test, because of the overhead of maintaining
 variant files.  The last supported version is the one that compiles and
 works.  You will note that 2.2 no longer compiles.  (It also failed the
 regression tests for a while before it started not compiling.)
 Typically, versions fall out of support because we add new functionality
 that the old Python versions cannot support anymore.
 
 all of the form
 
 ! ERROR:  could not import a module for Decimal constructor
 ! DETAIL:  ImportError: No module named decimal
 
 You can make this work by manually installing the decimal module
 (because it was not part of the core in Python 2.3).  Otherwise, this
 test result legitimately alerts you that some feature is not fully
 working and that you need to adjust your installation.

It would seem a single error (or warning, depending) like Missing
Dependency at the start, skipping all tests dependent on the error, would
be cleaner.  That way, if someone blindly starts running a clean 2.3
regression test they are not drowned in a sea of errors with no clear
documentation as to why, or how to resolve it.

Now, odds are the assumption that Decimal is present is spread through the
code-base so maybe something less invasive?

David J.





Re: [HACKERS] Minimum supported version of Python?

2014-03-17 Thread David Johnston
Joshua D. Drake wrote
 On 03/17/2014 07:31 PM, Peter Eisentraut wrote:
 
 On Sun, 2014-03-16 at 22:34 -0400, Tom Lane wrote:
 Well, if you want to consider python 2.3 as supported, I have a
 buildfarm
 machine I am about to put online that has 2.3 on it.  If I spin it up
 with
 python enabled, I expect you to see to it that it starts passing.  If
 you
 won't do that, I'm going to change the documentation.
 
 As I said, according to my testing, 2.3 is supported.  If your
 experience is different, then please submit a reproducible bug report.
 
 As for 2.4 vs 2.5, I don't have a lot of faith that we're really
 supporting anything that's not represented in the buildfarm...
 
 There are many other features that the build farm doesn't test and that
 I don't have a lot of faith in, but I'm not proposing to remove those.
 I don't control what the build farm tests, I only control my own work.
 
 We shouldn't be supporting anything the community doesn't support.
 Python 2.3 is dead. We shouldn't actively support it nor suggest that we
 could or should via the docs.
 
 There is certainly an argument for Python 2.4 (due to CentOS/RHEL) but
 other than that... really?
 
 JD

+1 generally, though it ignores the bigger question, which is how we are
actually defining and proving support; and also the whole we support 8.4
(for a little longer now), so what related technology should we either be
supporting or at least saying ya know, this was working just fine 4 years
ago if you use exactly these versions, so use those versions.  A last known
working version note in the plpythonu section wouldn't hurt, since actively
running all the possible combinations for the 5-year support cycle doesn't
make sense.

Is Peter /the/ final arbiter of fact in this area or are there some public
buildfarm animals out there that have been designated to fulfill this role?

Peter claims to be doing the compiling and testing for these, which is great
and would be better if people like Tom actually were aware of that fact, but
comments suggest that such testing is currently done offline.

To Peter's comment - reporting which version combinations are known to
compile is good but if a regression test exists then it should either pass
on a clean version or it should be documented (ideally right in the tests
themselves) what additional configuration is required (though maybe moving
that back to configure would be sufficient for our needs).  If there are no
coverage tests we are not asserting any proof anyway so if a release
compiles and passes regression tests it should be noted as such.  Once the
regressions start failing without any known fix it's hard to understand how
we could say such a combination is valid/supported anymore...

Supported - we will endeavor to fix bugs: add the test case and then fix
the problem.

Compiles/Tests-OK - we know it worked at one point but if we discover a bug
we're more than likely to just say that the version is no longer supported
and known to have bugs.  Add the test case that causes regression to fail
and move on.

David J.






Re: [HACKERS] Is this a bug

2014-03-13 Thread David Johnston
fabriziomello wrote
 On Thu, Mar 13, 2014 at 10:34 AM, Euler Taveira <euler@.com> wrote:

 On 13-03-2014 00:11, Fabrízio de Royes Mello wrote:
  Shouldn't the ALTER statements below raise an exception?
 
 For consistency, yes. Who cares? I mean, there is no harm in resetting
 an unrecognized parameter. Have in mind that tighten it up could break
 scripts. In general, I'm in favor of validating things.

 
 I know this could break scripts, but I think a consistent behavior should
 be raise an exception when an option doesn't exists.
 
 euler@euler=# reset noname;
 ERROR:  42704: unrecognized configuration parameter noname
 LOCATION:  set_config_option, guc.c:5220

 
 This is a consistent behavior.
 
 Regards,

Probably shouldn't back-patch but a fix and release comment in 9.4 is
warranted.

Scripts resetting invalid parameters are probably already broken; they just
haven't discovered their mistake yet.
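
For instance (illustrative; the parameter and database names are made up):

    RESET noname;                      -- ERROR: unrecognized configuration parameter
    ALTER DATABASE mydb RESET noname;  -- currently succeeds silently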

Do we need an IF EXISTS feature on these as well? ;)

David J.









Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread David Johnston
Tom Lane-2 wrote
 Unfortunately, while testing it I noticed that there's a potentially
 fatal backwards-compatibility problem, namely that the COPY n status
 gets printed on stdout, which is the same place that COPY OUT data is
 going.  While this isn't such a big problem for interactive use,
 usages like this one are pretty popular:
 
psql -c 'copy mytable to stdout' mydatabase | some-program
 
 With the patch, COPY n gets included in the data sent to some-program,
 which never happened before and is surely not what the user wants.
 The same if the -c string uses \copy.
 
 There are several things we could do about this:
 
 1. Treat this as a non-backwards-compatible change, and document that
 people have to use -q if they don't want the COPY tag in the output.
 I'm not sure this is acceptable.

I've mostly used COPY TO with files and so wouldn't mind if STDOUT had the
COPY n sent to it, as long as the target file holds just the copy contents.

 2. Kluge ProcessResult so that it continues to not pass back a PGresult
 for the COPY TO STDOUT case, or does so only in limited circumstances
 (perhaps only if isatty(stdout), for instance).

The main problem with this is that people will test by sending output to a
TTY and see the COPY n.  Although if it can be done consistently then you
minimize backward incompatibility and encourage people to enforce quiet mode
while the command runs...


 3. Modify PrintQueryStatus so that command status goes to stderr not
 stdout.  While this is probably how it should've been done in the first
 place, this would be a far more severe compatibility break than #1.
 (For one thing, there are probably scripts out there that think that any
 output to stderr is an error message.)  I'm afraid this one is definitely
 not acceptable, though it would be by far the cleanest solution were it
 not for compatibility concerns.

Yes, it's a moot point but I'm not sure it would be best anyway.


 4. As #3, but print the command status to stderr only if it's COPY n,
 otherwise to stdout.  This is a smaller compatibility break than #3,
 but still a break since COPY status was formerly issued to stdout
 in non TO STDOUT/FROM STDIN cases.  (Note that PrintQueryStatus can't
 tell whether it was COPY TO STDOUT rather than any other kind of COPY;
 if we want that to factor into the behavior, we need ProcessResult to
 do it.)

Since we are considering stderr, my (admittedly inexperienced) gut says that
using stderr for this is generally undesirable, especially given our
existing precedent.  stdout is the seemingly correct target, typically, and
the existing quiet-mode toggle provides sufficient control for typical
needs.


 5. Give up on the print-the-tag aspect of the change, and just fix the
 wrong-line-number issue (so we'd still introduce the copyStream variable,
 but not change how PGresults are passed around).
 
 I'm inclined to think #2 is the best answer if we can't stomach #1.
 But the exact rule for when to print a COPY OUT result probably
 still requires some debate.  Or maybe someone has another idea?
 
 Also, I'm thinking we should back-patch the aspects of the patch
 needed to fix the wrong-line-number issue.  That appears to have been
 introduced in 9.2; older versions of PG get the above example right.
 
 Comments?

I'd like COPY TO to anything but STDOUT to emit a COPY n on STDOUT -
unless suppressed by -q(uiet)
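
For example, using psql's existing -q switch (illustrative):

    psql -q -c 'copy mytable to stdout' mydatabase | some-program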

Document that COPY TO STDOUT does not emit COPY n because STDOUT is
already assigned for data and so is not available for notifications.  Since
COPY is more typically used for ETL than a bare-select, in addition to
back-compatibility concerns, this default behavior seems reasonable.

Would it be possible to store the n somewhere and provide a command - like
GET DIAGNOSTICS in pl/pgsql - if the user really wants to know how many rows
were sent to STDOUT?  I doubt this is even useful in the typical use-case
for COPY TO STDOUT but figured I'd toss the idea out there.

Is there anything besides a desire for consistency that anyone has or can
put forth as a use-case for COPY TO STDOUT emitting COPY n on STDOUT as
well?  If you are going to view the content inline, and also want a quick
count, ISTM you would be more likely to use SELECT to take advantage of all
its pretty-print features.

If we really need to cater to this use then maybe a --loud-copy-to-stdout
switch can be provided to override its default quiet-mode.

David J.







Re: Bug: Fix Wal replay of locking an updated tuple (WAS: Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6)

2014-03-12 Thread David Johnston
Joshua D. Drake wrote
 On 03/12/2014 06:15 PM, Tom Lane wrote:
 Robert Haas <robertmhaas@> writes:
 Discuss.

 This thread badly needs a more informative Subject line.

 
 No kidding. Or at least a link for goodness sake. Although the 
 pgsql-packers list wasn't all that helpful either.

A link would be nice, though if -packers is a security list then that may
not be a good thing since -hackers is public...

A quick search of Nabble and the Mailing Lists section of the homepage
does not indicate pgsql-packers exists - at least not in any publicly (even
if read-only) accessible way.

David J.







Re: [HACKERS] The case against multixact GUCs

2014-03-11 Thread David Johnston
Josh Berkus wrote
 Hackers,
 
 In the 9.3.3 updates, we added three new GUCs to control multixact
 freezing.  This was an unprecented move in my memory -- I can't recall
 ever adding a GUC to a minor release which wasn't backwards
 compatibility for a security fix.  This was a mistake.

It probably should have been handled better, but the decision to make these
parameters visible doesn't itself seem wrong - especially when limited to a
fairly recently released back-branch.


 What makes these GUCs worse is that nobody knows how to set them; nobody
 on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
 users (or 1 in 10 people on this list) know what a multixact *is*.

That isn't a reason in itself to not have the capability if it is actually
needed.


 Further, there's no clear justification why these cannot be set to be
 the same as our other freeze ages (which our users also don't
 understand), or a constant calculated portion of them, or just a
 constant.  Since nobody anticipated someone adding a GUC in a minor
 release, there was no discussion of this topic that I can find; the new
 GUCs were added as a side effect of fixing the multixact vacuum issue.
  Certainly I would have raised a red flag if the discussion of the new
 GUCs hadn't been buried deep inside really long emails.

The release documentation makes a pointed claim that the situation WAS that
the two were identical; but the different consumption rates dictated making
the multi-xact configuration independently configurable.  So in effect the
GUC was always present - just not user-visible.

Even if there are no current best practices surrounding this topic, at
least this way, as methods are developed, there is an actual place to put
the derived value.  As a starting point one can simply look at the defaults
and, if they have changed the value of the non-multixact setting, apply the
same factor to the custom multixact version.
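
For instance (an illustrative sketch only; defaults as shipped in 9.3):

    # if you halved the xid-based ceiling...
    autovacuum_freeze_max_age = 100000000            # default 200000000
    # ...halve the multixact-based one as well
    autovacuum_multixact_freeze_max_age = 200000000  # default 400000000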

Now, obviously someone has to think to actually do that - and the release
notes probably should have provided such guidance - but as I state
explicitly below the issue is more about insufficient communication and
education and less about providing the flexibility.


 Adding new GUCs which nobody has any idea how to set, or can even
 explain to new users, is not a service to our users.  These should be
 removed.

Or we should insist that those few that do have an understanding create some
kind of wiki document, or even a documentation section, to educate those
that are not as knowledgeable in this area.

For good reason, much of the recent focus in this area has been on actually
getting the feature to work.  Presuming that it is a desirable feature to
have - which it hopefully is, given it made it into the wild - such focus
has obviously been necessary given the apparent complexity of the feature
(as evidenced by the recent serious bug reports), but hopefully the feature
itself is now mostly locked down and education can begin.

David J.






Re: [HACKERS] db_user_namespace a temporary measure

2014-03-11 Thread David Johnston
Andrew Dunstan wrote
 On 03/11/2014 11:50 PM, Jaime Casanova wrote:
 On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane <tgl@.pa> wrote:
 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

 and why you want that restriction? when you login you need to specify
 the db, right? if you don't specify the db then is the global 'joe'
 that want to login

 
 You can't login without specifying a db. Every session is connected to a
 db.
 
 cheers
 
 andrew

Random Thoughts:

So if dave is already a user in db1 only that specific dave can be made a
global user - any other dave would be disallowed.

Would user + password be a better PK?  Even with the obvious issue that the
password part of the key can change, user-password to database is properly a
many-to-many relationship.  Or see next for something simpler.

A simple implementation would simply have the global users copied into each
database as it is constructed.  There would also be a link from each of the
database-specific users and the global master so that a password change
issued against the global user propagates to all the database-specific
versions.

Construction of a new global user would cause an immediate copy-over, which
can fail if the simple name is already in use in any of the existing
databases.  Promoting would be a problem that should ideally have a solution
- but then you are back to needing to distinguish between dave-db1 and
dave-db2...

Be nice if all users could be global and there would be some way to give
them permissions on databases.

Or go to the opposite extreme where all users are local and the concept of
global would have to be added by those desiring it.  Basically, move
user-management at the cluster level out to a cluster support application
and leave databases free-standing.

David J.






Re: [HACKERS] CREATE TYPE similar CHAR type

2014-03-06 Thread David Johnston
mohsencs wrote
 I want use CREATE TYPE to create one type similar to char.
 I want to when I create type, then my type behave similar to char:
 
 CREATE TABLE test (oneChar char);
 
 when I want insert one column with length > 1 to it, so it gets this error:
 ERROR:  value too long for type character(1)
 
 I want my type behave similar this but it behaves similar varchar type.

If you can get over the need for using CREATE TYPE you'll find that using
CREATE DOMAIN with a check constraint will probably meet your needs
perfectly.
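
Something along these lines (a sketch; the domain name is made up):

    CREATE DOMAIN one_char AS varchar
        CHECK (char_length(VALUE) <= 1);

    CREATE TABLE test (oneChar one_char);
    INSERT INTO test VALUES ('ab');  -- fails the CHECK constraint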

David J.






Re: [HACKERS] Equivalence Rules

2014-02-28 Thread David Johnston
Ali Piroozi wrote
 Hi
 Which equivalence rule from those are listed in
 email's attachment are implemented in postgresql?
 where are them?

What do you mean by where?

The various JOINS and UNION/INTERSECT/DIFFERENCE are all defined
capabilities.

SQL is not purely relational in nature so some of the mathematical rules may
not be exactly implemented as theory would dictate but largely they are
present.

I would suggest providing more context as to why you are asking this, as
people are more likely to provide help at this level of technical depth if
they know why, and can frame their answers more appropriately.

David J.







Re: [HACKERS] Function sugnature with default parameter

2014-02-26 Thread David Johnston
salah jubeh wrote
 Hello,
 
 I find default values confusing when a function is overloaded, below is an
 example. 
 
 
 CREATE OR REPLACE FUNCTION default_test (a INT DEFAULT 1, b INT DEFAULT 1,
 C INT DEFAULT 1) RETURNS INT AS
 $$
     BEGIN
         RETURN a+b+c;
     END;
 $$
 LANGUAGE 'plpgsql';

Provide a real use-case for this need and maybe someone will consider it
worthwhile to implement.  

In the meantime use VARIADIC or define only the longest-default signature
and provide reasonable default values for all optional arguments.  In this
case the defaults should be zero, not one.
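
A sketch of that longest-default-signature approach, with zero as the
identity value for addition:

    CREATE OR REPLACE FUNCTION default_test (a INT DEFAULT 0, b INT DEFAULT 0,
    c INT DEFAULT 0) RETURNS INT AS
    $$
    BEGIN
        RETURN a + b + c;
    END;
    $$
    LANGUAGE plpgsql;

    SELECT default_test(5);     -- 5; b and c fall back to 0
    SELECT default_test(5, 2);  -- 7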

Or don't make any of the arguments DEFAULT and provide as many overloads as
you use.  The whole point of DEFAULT was to avoid having to do this for
simple cases - you are not compelled to use default values if your use-case
is too complex for them.

Or just avoid overloading and use different names that describe better what
the different versions accomplish.

Specifying DEFAULT in the function call seems to defeat the point.

David J.






Re: [HACKERS] Simplified VALUES parameters

2014-02-26 Thread David Johnston
Leon Smith wrote
 Hi,  I'm the maintainer and a primary author of a postgresql client
 library
 for Haskell,  called postgresql-simple,  and I recently investigated
 improving support for VALUES expressions in this library.  As a result,
 I'd
 like to suggest two changes to postgresql:
 
 1.   Allow type specifications inside AS clauses,  for example
 
 (VALUES (1,'hello'),(2,'world')) AS update(x int, y text)
 
 2.  Have an explicit syntax for representing VALUES expressions which
 contain no rows,  such as VALUES ().   (although the precise syntax isn't
 important to me.)
 
 My claim is that these changes would make it simpler for client libraries
 to properly support parameterized VALUES expressions.  If you care,  I've
 included a postscript including a brief background,  and a link to my
 analysis and motivations.

At a high-level I don't see how the nature of SQL would allow for either of
these things to work.  The only reason there even is (col type, col2 type)
syntax is because record-returning functions have to have their return type
defined during query construction.  The result of processing a VALUES clause
has to be a normal relation - the subsequent presence of AS simply provides
column name aliases because in the common form each column is assigned a
generic name during execution.
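
For what it's worth, today the typing is pinned by casting in the first row,
e.g.:

    SELECT * FROM (VALUES (1::int, 'hello'::text),
                          (2,      'world')) AS t(x, y);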

Defining a generic empty-values expression has the same problem in that you
have to define how many columns, with what types and names, the VALUES
expression needs to generate.

From what I can see SQL is not going to readily allow for the construction
of virtual tables via parameters.  You need to either make those tables
non-virtual (even if temporary) or consolidate them into an ARRAY.  In short,
you - the client library - probably can solve the virtual table problem, but
you will have to accommodate user-specified typing somehow in order to
supply valid SQL to the server.

The two common solutions for your specified use-case are: either the user
creates the needed temporary table and writes the update query to join
against that, OR they write the generic single-record update statement and
then loop over all desired input values - ideally all done within a
transaction.  In your situation you could automate that by taking your
desired syntax and constructing a complete script that can then be sent to
PostgreSQL.

I don't imagine that the need for dynamically specified virtual tables is
going to be strong enough for people to dedicate the amount of resources it
would take to implement such a capability.

David J.






Re: [HACKERS] 'dml' value for log_statement

2014-02-06 Thread David Johnston
Sawada Masahiko wrote
 Hi all,
 
 Attaching patch provides new value 'dml'  for log_statement.
 Currently, The server logs modification statements AND data definition
 statements if log_statement is set 'mod'.
 So we need to set the 'all' value for log_statement and remove
 unnecessary information
 if we would like to log only DML.
 
 This patch allows the user to set in detail the setting.
 The server logs statement only when INSERT. UPDATE, DELETE and TRUNCATE
 statement are executed.
 ( same as 'mod' value which does not log the DDL)
 
 Comments?
 
 Regards,
 
 ---
 Sawada Masahiko

-1

I'm not fully versed on what log levels provide what data, but if you care
about changes to data then ALTER TABLE table is just as important an
activity as UPDATE table, so mod exhibits the recommended behavior and
dml provides something that should be of minimal value.
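
For reference, the existing setting being defended here:

    # postgresql.conf
    log_statement = 'mod'   # INSERT/UPDATE/DELETE/TRUNCATE plus DDL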

DDL by itself has value, since monitoring schema changes without worrying
about transient data updates is understandable.  But a schema-change, by
definition, alters data.

Maybe further description of why you find mod unsatisfactory would be
helpful.

Though it is simple enough to just let people use their own judgement.

David J.






[HACKERS] Re: Patch: regexp_matches variant returning an array of matching positions

2014-01-28 Thread David Johnston
Alvaro Herrera-9 wrote
 Björn Harrtell wrote:
 I've written a variant of regexp_matches called regexp_matches_positions
 which instead of returning matching substrings will return matching
 positions. I found use of this when processing OCR scanned text and
 wanted
 to prioritize matches based on their position.
 
 Interesting.  I didn't read the patch but I wonder if it would be of
 more general applicability to return more info in a fell swoop a
 function returning a set (position, length, text of match), rather than
 an array.  So instead of first calling one function to get the match and
 then their positions, do it all in one pass.
 
 (See pg_event_trigger_dropped_objects for a simple example of a function
 that returns in that fashion.  There are several others but AFAIR that's
 the simplest one.)

Confused as to your thinking. Like regexp_matches this returns SETOF
type[].  In this case integer but text for the matches.  I could see adding
a generic function that returns a SETOF named composite (match varchar[],
position int[], length int[]) and the corresponding type.  I'm not imagining
a situation where you'd want the position but not the text and so having to
evaluate the regexp twice seems wasteful.  The length is probably a waste
though since it can readily be gotten from the text and is less often
needed.  But if it's pre-calculated anyway...
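
A sketch of what that composite could look like (the names here are made up;
nothing like this exists today):

    CREATE TYPE regexp_match_detail AS (
        match    text[],
        position int[],
        length   int[]
    );
    -- with a hypothetical regexp_matches_detail(subject, pattern, flags)
    -- declared as RETURNS SETOF regexp_match_detail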

My question is what position is returned in a multiple-match situation? The
supplied test only covers the simple, non-global, situation.  It needs to
exercise empty sub-matches and global searches.  One theory is that the
first array slot should cover the global position of match zero (i.e., the
entire pattern) within the larger document while sub-matches would be
relative offsets within that single match.  This conflicts, though, with the
fact that _matches only returns array elements for () items and never for
the full match - the goal in this function being parallel un-nesting.  But as
nesting is allowed it is still possible for this to occur.

How does this resolve in the patch?

SELECT regexp_matches('abcabc','((a)(b)(c))','g');

David J.









[HACKERS] Re: Patch: regexp_matches variant returning an array of matching positions

2014-01-28 Thread David Johnston
Erik Rijkers wrote
 On Wed, January 29, 2014 05:16, David Johnston wrote:

 How does this resolve in the patch?

 SELECT regexp_matches('abcabc','((a)(b)(c))','g');

 
 With the patch:
 
 testdb=# SELECT regexp_matches('abcabc','((a)(b)(c))','g'),
 regexp_matches_positions('abcabc','((a)(b)(c))');
  regexp_matches | regexp_matches_positions
 ----------------+--------------------------
  {abc,a,b,c}    | {1,1,2,3}
  {abc,a,b,c}    | {1,1,2,3}
 (2 rows)

The {1,1,2,3} in the second row is an artifact/copy from
set-value-function-in-select-list repetition and has nothing to do with the
second match.


 testdb=# SELECT regexp_matches('abcabc','((a)(b)(c))','g'),
 regexp_matches_positions('abcabc','((a)(b)(c))', 'g');
  regexp_matches | regexp_matches_positions
 ----------------+--------------------------
  {abc,a,b,c}    | {1,1,2,3}
  {abc,a,b,c}    | {4,4,5,6}
 (2 rows)

As expected.

David J.







[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier

2014-01-06 Thread David Johnston
Andres Freund-3 wrote
 On 2014-01-06 09:12:03 -0800, Mark Dilger wrote:
 The reason I was going to all the trouble of creating
 chrooted environments was to be able to replicate
 clusters that have tablespaces.  Not doing so makes
 the test code simpler at the expense of reducing
 test coverage.
 
 I am using the same binaries.  The chroot directories
 are not chroot jails.  I'm intentionally bind mounting
 out to all the other directories on the system, except
 the other clusters' data directories and tablespace
 directories.  The purpose of the chroot is to make the
 paths the same on all clusters without the clusters
 clobbering each other.
 
 I don't think the benefit of being able to test tablespaces without
 restarts comes even close to offsetting the cost of requiring sudo
 permissions and introducing OS dependencies. E.g. there's pretty much no
 hope of making this work sensibly on windows.
 
 So I'd just leave out that part.

Only skimming this thread, but even if only a handful of buildfarm animals
can run this extended test bundle because of the restrictive requirements, it
is likely better than discarding the tests altogether.  The main thing in
this case is to segregate out this routine so that it has to be invoked
explicitly, and ideally in an ignore if pre-reqs are missing manner.

Increasing the likelihood and frequency of test runs on what is a fairly
popular platform, and which covers non-OS-specific code as well, has
benefits.  As long as it doesn't poison anything else I don't see much harm
coming of it.

David J.






[HACKERS] Re: Fixing bug #8228 (set-valued function called in context that cannot accept a set)

2014-01-06 Thread David Johnston
Tom Lane-2 wrote
 I kinda forgot about this bug when I went off on vacation:
 http://www.postgresql.org/message-id/

 E1UnCv4-0007oF-Bo@.postgresql

Just to clarify:

This patch will cause both executions of the example query to fail with the
set-valued function... error.

Also, the reason the ::varchar one did not fail was because no cast
function was ever called, but the ::varchar(30) forced a function call and
thus prompted the error when the second record and resultant regexp_matches
expression was encountered.

Thanks!

David J.





[HACKERS] Re: Fixing bug #8228 (set-valued function called in context that cannot accept a set)

2014-01-06 Thread David Johnston
David Johnston wrote
 
 Tom Lane-2 wrote
 I kinda forgot about this bug when I went off on vacation:
 http://www.postgresql.org/message-id/

 E1UnCv4-0007oF-Bo@.postgresql

 
 Just to clarify:
 
 This patch will cause both executions of the example query to fail with
 the set-valued function... error.
 
 Also, the reason the ::varchar one did not fail was because not cast
 function was ever called but the ::varchar(30) forced a function call
 and thus prompted the error when the second record and resultant
 regexp_matches expression was encountered.
 
 Thanks!
 
 David J.

Sorry for the imprecise English.  I believe both of the following items but
would like confirmation/clarification of my understanding.  

The whole varchar/varchar(30) discrepancy is bothersome and since the
example forces a function-call via the use of lower(...), and doesn't test
the non-function situation, I am concerned this patch is incorrect.

If the first item is not true, i.e. this patch makes both alternatives work,
then I think the wrong solution was chosen - or at least not fully vetted.

David J.






[HACKERS] Re: Fixing bug #8228 (set-valued function called in context that cannot accept a set)

2014-01-06 Thread David Johnston
Tom Lane-2 wrote
 David Johnston <polobo@> writes:
 The whole varchar/varchar(30) discrepancy is bothersome and since the
 example forces a function-call via the use of lower(...), and doesn't
 test
 the non-function situation, I am concerned this patch is incorrect.
 
 The reason casting to varchar(30) fails is that that results in invocation
 of a function (to enforce the length limit).  Casting to varchar is just a
 RelabelType operation, which doesn't have two different code paths for
 set and not-set inputs.
 
 I did check the patch against your original example, but I thought using
 lower() made the purpose of the regression test case more apparent.

Yeah, I caught that part.  My focus was on the non-function version.

Not being able to apply the patch and test myself, it sounds like you likely
made the function-invocation version succeed along with the original
re-label-only version.

I guess backward-compatibility concerns force this solution, but after
thinking through what was happening I was leaning more toward making both
queries fail.  An SRF in a CASE expression seems like a foot-gun to me.  An
SRF in the select-list is somewhat of a foot-gun too, but understandable
given the recency of the addition of LATERAL to our toolbox.

David J.







Re: [HACKERS] trailing comment ghost-timing

2013-12-23 Thread David Johnston
Andreas Karlsson wrote
 On 12/24/2013 02:05 AM, Erik Rijkers wrote:
 With \timing on, a trailing comment yields a timing.

 # test.sql
 select 1;

 /*
 select 2
 */

 $ psql -f test.sql
   ?column?
 --
  1
 (1 row)

 Time: 0.651 ms
 Time: 0.089 ms

 I assume it is timing something about that comment (right?).

 Confusing and annoying, IMHO.  Is there any chance such trailing
 ghost-timings can be removed?
 
 This seems to be caused by psql sending the comment to the server to 
 evaluate as a query. I personally think timing should always output 
 something for every command sent to the server. To fix this problem I 
 guess one would have to modify psql_scan() to check if a scanned query 
 only contains comments and then not send it to the server at all.
 
 The minimal file to reproduce it is:
 
 /**/
 
 -- 
 Andreas Karlsson

I need to be convinced that the server should not just silently ignore
trailing comments.  I'd consider an exception if the only text sent is a
comment (in such a case we should throw an error), but if valid commands
are sent and there is just some comment text at the end it should be ignored
the same as if the comments were embedded in the middle of the query text.

I've encountered other clients that output phantom results in this
situation, and solving it at the server seems worthwhile so client
applications do not have to care.

In the example case, I think, putting the comment before the command results
in only one timing.  This inconsistency is a symptom of this situation being
handled incorrectly.
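
For instance, with \timing on (illustrative):

    /*
    select 2
    */
    select 1;

yields a single timing line, since the leading comment travels with the
statement that follows it.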

David J.







Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread David Johnston
Marko Tiikkaja-4 wrote
 On 2013-12-18 22:32, Andrew Dunstan wrote:
 You're not really free to assume it - you'll need an exception handler
 for the other-than-1 case, or your code might blow up.

 This seems to be codifying a bad pattern, which should be using
 array_lower() and array_upper() instead.
 
 That's the entire point -- I *want* my code to blow up.  If someone 
 passes a multi-dimensional array to a function that assumes its input is 
 one-dimensional and its indexes start from 1, I want it to be obvious 
 that the caller did something wrong.  Now I either copy-paste lines and 
 lines of codes to always test for the weird cases or my code breaks in 
 subtle ways.
 
 This is no different from an Assert() somewhere -- if the caller breaks 
 the documented interface, it's his problem, not mine.  And I don't want 
 to waste my time coding around the fact that this simple thing is so 
 hard to do in PG.

1) Why can't we just make the second argument of the current function
optional and default to 1?
2) How about providing a function that returns the 1-dim/lower=1 input
array or raises an exception if the input array does not conform?

Not tested; a rough plpgsql sketch of the idea:

CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray AS
$$
BEGIN
    -- an empty array is trivially normal
    IF arr = '{}' THEN
        RETURN arr;
    END IF;
    IF array_ndims(arr) > 1 THEN
        RAISE EXCEPTION 'array must be one-dimensional';
    END IF;
    IF array_lower(arr, 1) <> 1 THEN
        RAISE EXCEPTION 'array lower bound must be 1';
    END IF;
    RETURN arr;
END;
$$ LANGUAGE plpgsql;

I can also see wanting 1-dimensional enforced without having to require the
lower-bound to be 1 so maybe a separate function for that.

Usage:

SELECT array_length(array_normal(input_array))

I could see this being especially useful for a domain and/or column
constraint definition and also allowing for a textbook case of separation of
concerns.

I am torn, but mostly opposed, to making an array_length(anyarray) function
with these limitations enforced - especially if other similar functions are
not created at the same time.  I fully agree that array_length(anyarray)
should be a valid call without requiring the user to specify , 1 by rote.

Tangential Question:

Is there any way to define a non-1-based array without using array-literal
syntax but by using ARRAY[1,2,3] syntax?
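
(For clarity, the array-literal form being referred to is the extended one
with explicit bounds, e.g.:

    SELECT '[0:2]={10,20,30}'::int[];  -- lower bound 0
)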


David J.







[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread David Johnston
MauMau wrote
 From: David Johnston <polobo@>
 5. FATAL:  terminating walreceiver process due to administrator command
 6. FATAL:  terminating background worker \%s\ due to administrator
 command
 5 and 6: I don't fully understand when they would happen but likely fall
 into the same the DBA should know what is going on with their server and
 confirm any startup/shutdown activity it is involved with.

 They might be better categorized NOTICE level if they were in response 
 to
 a administrator action, versus in response to a crashed process, but even
 for the user-initiated situation making sure they hit the log but using
 FATAL is totally understandable and IMO desirable.
 
 #5 is output when the DBA shuts down the replication standby server.
 #6 is output when the DBA shuts down the server if he is using any custom 
 background worker.
 These messages are always output.  What I'm seeing as a problem is that 
 FATAL messages are output in a normal situation, which worries the DBA,
 and 
 those messages don't help the DBA with anything.  They merely worry the
 DBA.

While worry is something to be avoided, not logging messages is not the
correct solution.  On the project side, looking for ways and places to better
communicate with the user is worthwhile in the absence of adding additional
complexity to the logging system.  The user/DBA, though, is expected to
learn how the process works, ideally in a testing environment where worry is
much less a problem, and understand that some of what they are seeing are
client-oriented messages that they should be aware of but that do not
indicate server-level issues by themselves.

They should probably run the log through a filter (a template of which the
project should provide) that takes the raw log and filters out those things
that, relative to the role and context, are really better classified as
NOTICE even if the log-level is shown as FATAL.  
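
A trivial illustration of such a filter, with grep as a stand-in for a real
template the project might provide:

    grep -v 'FATAL:  the database system is starting up' postgresql.log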

Raw logs should be verbose so that tools can have more data with which to
make decisions.  You can always hide what is provided, but you can never show
what has been omitted.  I get that people will scan raw logs, but targeting
that lowest common denominator isn't necessarily the best thing to do.
Instead, getting those people on board with better practices and showing
(and providing) them helpful tools should be the goal.

David J.







Re: [HACKERS] RFC: programmable file format for postgresql.conf

2013-12-06 Thread David Johnston
Álvaro Hernández Tortosa wrote
   Note that you are not required to maintain your configuration data in a
 postgresql.conf-formatted file.  You can keep it anywhere you like, GUI
 around in it, and convert it back to the required format.  Most of the
 
   I think it is not a very good idea to encourage GUI tools or tools to 
 auto-configure postgres to use a separate configuration file and then 
 convert it to postgresql.conf. That introduces a duplicity with evil 
 problems if either source of data is modified out-of-the-expected-way.
 
   That's why I'm suggesting a config file that is, at the same time, 
 usable by both postgres and other external tools. That also enables 
 other features such as editing the config file persistently through a 
 SQL session.

For my money I'd rather have a single file and/or directory-structure where
raw configuration settings are saved in the current 'key = value' format
with simple comments allowed and ignored by PostgreSQL.  And being simple
key-value the risk of out-of-the-expected-way changes would be minimal.

If you want to put an example configuration file out there, one that will
not be considered to be the true configuration, with lots of comments and
meta-data, then great.  I'm hoping that someday there is either a
curses-based or even a full-fledged GUI that beginners can use to generate
the desired configuration.

If we want to put a separate configuration meta-data file out there to
basically provide a database from which third-party tools can pull out this
information then great.  I would not incorporate that same information into
the main PostgreSQL configuration file/directory-structure.  The biggest
advantage is that the meta-data database can be readily modified without any
concern regarding such changes impacting running systems upon update.  Then,
tools simply need to import two files instead of one, link together the
meta-data key with the configuration key, and do whatever they were going to
do anyway.

If indeed the target audience is going to be novices then a static
text-based document is not going to be the most desirable interface to
present.  At worst we should simply include a comment-link at the top of the
document to a web-page where an interactive tool for configuration file
creation would exist.  That tool, at the end of the process, could provide
the user with text to copy-paste/save into a specified area on the server so
the customizations made would override the installed defaults.

David J.







Re: [HACKERS] WITHIN GROUP patch

2013-12-06 Thread David Johnston
Andrew Gierth wrote
 Tom == Tom Lane <tgl@.pa> writes:
 
   Please don't object that that doesn't look exactly like the syntax
   for calling the function, because it doesn't anyway --- remember
   you also need ORDER BY in the call.
 
  Tom Actually, now that I think of it, why not use this syntax for
  Tom declaration and display purposes:
 
  Tom type1, type2 ORDER BY type3, type4
 
  Tom This has nearly as much relationship to the actual calling
  Tom syntax as the WITHIN GROUP proposal does,
 
 But unfortunately it looks exactly like the calling sequence for a
 normal aggregate with an order by clause - I really think that is
 potentially too much confusion. (It's one thing not to look like
 the calling syntax, it's another to look exactly like a valid
 calling sequence for doing something _different_.)
 
 -- 
 Andrew (irc:RhodiumToad)

How about:

type1, type2 GROUP ORDER type3, type4

OR

GROUP type1, type2 ORDER type3, type4
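
For reference, the calling syntax these declaration forms are trying to
summarize looks like this under the patch (a sketch; t, a, and b are
placeholders):

    SELECT rank(2, 'foo') WITHIN GROUP (ORDER BY a, b) FROM t;

where the direct arguments (2, 'foo') correspond to type1/type2 and the
sort keys a, b to type3/type4.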

David J.








[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-06 Thread David Johnston
MauMau wrote
 From: Tom Lane tgl@.pa
 There is no enthusiasm for a quick-hack solution here, and most people
 don't actually agree with your proposal that these errors should never
 get logged.  So no, that is not happening.  You can hack your local
 copy that way if you like of course, but it's not getting committed.
 
 Oh, I may have misunderstood your previous comments.  I got the impression 
 that you and others regard those messages (except too many clients) as 
 unnecessary in server log.
 
 1. FATAL:  the database system is starting up
 2. FATAL:  the database system is shutting down
 3. FATAL:  the database system is in recovery mode
 
 5. FATAL:  terminating walreceiver process due to administrator command
 6. FATAL:  terminating background worker "%s" due to administrator
 command
 
 Could you tell me why these are necessary in server log?  I guess like
 this. 
 Am I correct?
 
 * #1 through #3 are necessary for the DBA to investigate and explain to
 the 
 end user why he cannot connect to the database.
 
 * #4 and #5 are unnecessary for the DBA.  I can't find out any reason why 
 these are useful for the DBA.

For me 1-3 are unusual events in production situations and so knowing when
they occur, and confirming they occurred for a good reason, is a key job of
the DBA.

5 and 6: I don't fully understand when they would happen, but they likely
fall into the same category: the DBA should know what is going on with their
server and should confirm any startup/shutdown activity it is involved in.

They might be better categorized at NOTICE level if they were in response to
an administrator action, versus in response to a crashed process, but even
for the user-initiated situation making sure they hit the log, even at
FATAL, is totally understandable and IMO desirable.

I'd ask: in what situations are these messages occurring so frequently that
they become noise instead of useful data?  Sorry if I missed your
use-case explanation up-thread.

David J.








[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-06 Thread David Johnston
David Johnston wrote
 
 MauMau wrote
 From: Tom Lane tgl@.pa
 There is no enthusiasm for a quick-hack solution here, and most people
 don't actually agree with your proposal that these errors should never
 get logged.  So no, that is not happening.  You can hack your local
 copy that way if you like of course, but it's not getting committed.
 
 Oh, I may have misunderstood your previous comments.  I got the
 impression 
 that you and others regard those messages (except too many clients) as 
 unnecessary in server log.
 
 1. FATAL:  the database system is starting up
 2. FATAL:  the database system is shutting down
 3. FATAL:  the database system is in recovery mode
 
 5. FATAL:  terminating walreceiver process due to administrator command
 6. FATAL:  terminating background worker "%s" due to administrator
 command
 
 Could you tell me why these are necessary in server log?  I guess like
 this. 
 Am I correct?
 
 * #1 through #3 are necessary for the DBA to investigate and explain to
 the 
 end user why he cannot connect to the database.
 
 * #4 and #5 are unnecessary for the DBA.  I can't find out any reason why 
 these are useful for the DBA.
 For me 1-3 are unusual events in production situations and so knowing when
 they occur, and confirming they occurred for a good reason, is a key job
 of the DBA.
 
 5 and 6: I don't fully understand when they would happen, but they likely
 fall into the same category: the DBA should know what is going on with
 their server and should confirm any startup/shutdown activity it is
 involved in.
 
 They might be better categorized at NOTICE level if they were in response
 to an administrator action, versus in response to a crashed process, but
 even for the user-initiated situation making sure they hit the log, even
 at FATAL, is totally understandable and IMO desirable.
 
 I'd ask: in what situations are these messages occurring so frequently that
 they become noise instead of useful data?  Sorry if I missed your
 use-case explanation up-thread.
 
 David J.

Went and scanned the thread:

PITR/Failover is not generally that frequent an occurrence but I will agree
that these events are likely common during such operations.

Maybe PITR/failover mode can output something in the logs to alleviate user
angst about these frequent events?  I doubt that a totally separate
mechanism can be used for this mode, but instead of looking for things to
remove, how about adding some additional coddling to the logs at the
beginning and end of the mode change?

Thought-provoking only, as I have not actually been a user of said feature.

David J.







[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-06 Thread David Johnston
MauMau wrote
 From: Tom Lane tgl@.pa
 There is no enthusiasm for a quick-hack solution here, and most people
 don't actually agree with your proposal that these errors should never
 get logged.  So no, that is not happening.  You can hack your local
 copy that way if you like of course, but it's not getting committed.
 
 Oh, I may have misunderstood your previous comments.  I got the impression 
 that you and others regard those messages (except too many clients) as 
 unnecessary in server log.
 
 1. FATAL:  the database system is starting up

How about altering the message to tone down the severity by a half-step...

FATAL: (probably) not! - the database system is starting up

David J.






[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-05 Thread David Johnston
Tom Lane-2 wrote
 MauMau maumau307@ writes:
 Shouldn't we lower the severity or avoiding those messages to server log? 
 
 No.  They are FATAL so far as the individual session is concerned.
 Possibly some documentation effort is needed here, but I don't think
 any change in the code behavior would be an improvement.
 
 1. FATAL:  the database system is starting up
 2. FATAL:  the database system is shutting down
 3. FATAL:  the database system is in recovery mode
 4. FATAL:  sorry, too many clients already
 Report these as FATAL to the client because the client wants to know the 
 reason.  But don't output them to server log because they are not
 necessary 
 for DBAs (4 is subtle.)
 
 The notion that a DBA should not be allowed to find out how often #4 is
 happening is insane.

Agreed #4 is definitely DBA territory.

ISTM that instituting some level of categorization for messages would be
helpful.  Then logging and reporting frameworks would be able to identify
and segregate the logs in whatever way they and the configuration deem
appropriate.

FATAL: [LOGON] too many clients already

I'd make the category output disabled by default for a long while then
eventually enabled by default but leave the ability to disable.  Calls that
do not supply a category get [N/A] output in category mode.
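
So a mixed log might look like this (hypothetical output, reusing the tag
style above; not existing PostgreSQL behavior):

    FATAL: [LOGON] too many clients already
    ERROR: [N/A] division by zero

with the second line coming from a call site that supplied no category.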

David J.









Re: [HACKERS] WITHIN GROUP patch

2013-12-05 Thread David Johnston
Tom Lane-2 wrote
 Further questions about WITHIN GROUP:
 
 I believe that the spec requires that the direct arguments of an inverse
 or hypothetical-set aggregate must not contain any Vars of the current
 query level.  They don't manage to say that in plain English, of course,
 but in the "hypothetical set function" case the behavior is defined in
 terms of this sub-select:
 
   ( SELECT 0, SK1, ..., SKK
 FROM TNAME UNION ALL
 VALUES (1, VE1, ..., VEK) )
 
 where SKn are the sort key values, TNAME is an alias for the input table,
 and VEn are the direct arguments.  Any input-table Vars the aggregate
 might refer to are thus in scope only for the SKn not the VEn.  (This
 definition also makes it clear that the VEn are to be evaluated once,
 not once per row.)  In the "inverse distribution function" case, they
 resort to claiming that the value of the "inverse distribution function
 argument" can be replaced by a numeric literal, which again makes it clear
 that it's supposed to be evaluated just once.
 
 So that's all fine, but the patch seems a bit confused about it.  If you
 try to cheat, you get an error message that's less than apropos:
 
 regression=# select cume_dist(f1) within group(order by f1) from text_tbl
 ;
 ERROR:  column text_tbl.f1 must appear in the GROUP BY clause or be used
 in an aggregate function
 LINE 1: select cume_dist(f1) within group(order by f1) from text_tbl...
  ^
 
 I'd have hoped for a message along the line of fixed arguments of
 cume_dist() must not contain variables, similar to the phrasing we
 use when you try to put a variable in LIMIT.
 
 Also, there are some comments and code changes in nodeAgg.c that seem
 to be on the wrong page here:
   
 +   /*
 +* Use the representative input tuple for any references to
 +* non-aggregated input columns in the qual and tlist.  (If we are not
 +* grouping, and there are no input rows at all, we will come here
 +* with an empty firstSlot ... but if not grouping, there can't be
 any
 +* references to non-aggregated input columns, so no problem.)
 +* We do this before finalizing because for ordered set functions,
 +* finalize_aggregates can evaluate arguments referencing the
 tuple.
 +*/
 +   econtext->ecxt_outertuple = firstSlot;
 + 
 
 The last two lines of that comment are new, all the rest was moved from
 someplace else.  Those last two lines are wrong according to the above
 reasoning, and if they were correct the argument made in the pre-existing
 part of the comment would be all wet, meaning the code could in fact crash
 when trying to evaluate a Var reference in finalize_aggregates.
 
 So I'm inclined to undo this part of the patch (and anything else that's
 rearranging logic with an eye to Var references in finalize_aggregates),
 and to try to fix parse_agg.c so it gives a more specific error message
 for this case.

The original design references the spec as allowing a column reference if it
is a group by key:

Select cume_dist(f1) within group ( order by f1 ) from text_tbl group by f1

No example was shown where this would be useful...but nonetheless the
comment and error both presume that such is true.

I would presume the implementation would only supply rows for sorting from
the single group in question, so for practical purposes the columns have a
constant value for the entirety of the evaluation; this makes sense and
acts just like PARTITION BY on a window aggregate.

Question: are there any tests that exercise this desired behavior?  That is,
assuming agreement can be had that the GROUP BY behavior is indeed per spec,
or something custom we want to support.

David J.






Re: [HACKERS] Add full object name to the tag field

2013-12-03 Thread David Johnston
Robert Haas wrote
 On Mon, Dec 2, 2013 at 9:49 AM, Asit Mahato rigid.asit@ wrote:
 Hi all,

 I am a newbie. I am unable to understand the to do statement given below.

 Add full object name to the tag field. eg. for operators we need
 '=(integer,
 integer)', instead of just '='.

 please help me out with an example.

 Thanks and Regards,
 Asit Mahato
 
 Cast the OID of the operator to regoperator instead of regoper.

This seems too simple an answer to be useful, and utterly confusing to the
OP.  The ToDo item in question is in the pg_dump/pg_restore section.  It
would seem to possibly be referring to this e-mail thread:

Re: pg_dump sort order for functions

http://www.postgresql.org/message-id/flat/9837222c1001120712t1e64eb2fg9f502757b6002...@mail.gmail.com#9837222c1001120712t1e64eb2fg9f502757b6002...@mail.gmail.com

though there is no link to prior discussion attached to the ToDo item.  The
thread itself suggests there was yet prior discussion on the topic though my
quick search did not turn anything up.

It seems that not all objects (though it appears functions are currently one
exception) are fully descriptive in their tag/name output, which makes
deterministic ordering more difficult.  The goal is, say for operators, to
output not only the base operator symbol (regoper) but the types associated
with the left-hand and right-hand sides (regoperator).  The additional type
information makes the entire name unique (barring cross-schema conflicts at
least, which can be mitigated) and allows for deterministic ordering.
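
A quick sketch of the difference against the catalogs (the WHERE clause just
singles out integer equality; the output shape is illustrative):

    SELECT oid::regoper AS tag_now, oid::regoperator AS tag_wanted
    FROM pg_operator
    WHERE oprname = '=' AND oprleft = 'integer'::regtype
      AND oprright = 'integer'::regtype;

     tag_now |     tag_wanted
    ---------+--------------------
     =       | =(integer,integer)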

Asit, hopefully this gives you enough context to ask better questions which
others can answer more fully.  Also, it may help to give more details about
yourself and your goals and not just throw out that you are a newbie.  The
latter gives people little guidance in how they should structure their help.

David J.



[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-28 Thread David Johnston
Robert Haas wrote
 
 Issuing <command>ROLLBACK</command> outside of a transaction
 block has the sole effect of emitting a warning.
 
 Sure, that sounds OK.
 
 ...Robert

+1 for:

Issuing <command>ROLLBACK</command> outside of a transaction
block has no effect except emitting a warning.

In all of these cases we are assuming that the user understands that
emitting a warning means that something is being logged to disk and thus is
causing a resource drain.

I like explicitly saying that issuing these commands is pointless/has no
effect; being indirect and saying that the only thing they do is emit a
warning omits any explicit explanation of why.  And while I agree
that logging the warning is an effect, it is not the primary/direct
effect that the user cares about.

I would maybe change the above to:

*Issuing <command>ROLLBACK</command> outside of a transaction block has no
effect: thus it emits a warning [to both user and log file].*

I do like thus instead of except due to the explicit causality link that
it establishes.  We emit a warning because what you just did is pointless.
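
For reference, the current behavior in psql already reads this way:

    postgres=# ROLLBACK;
    WARNING:  there is no transaction in progress
    ROLLBACK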

David J.









Re: [HACKERS] bytea_ouput = escape vs encode(byte, 'escape')

2013-11-27 Thread David Johnston
Jim Nasby-2 wrote
 I'm wondering why bytes_output = escape produces different output than
 encode(byte, 'escape') does. Is this intentional? If so, why?
 
 cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
   cr  |  lf  
 --+--
  \x0d | \x0a
 (1 row)
 
 cnuapp_prod@postgres=# set bytea_output = escape;
 SET
 cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
   cr  |  lf  
 --+--
  \015 | \012
 (1 row)
 
 cnuapp_prod@postgres=# select encode(e'\r'::bytea,'escape') AS cr,
 encode(e'\n'::bytea, 'escape') AS lf;
  cr | lf 
 +
  \r |   +
 | 
 (1 row)
 
 cnuapp_prod@postgres=# 

encode takes a bytea and provides what it would be as text (using the
specified encoding to perform the conversion).

The bytea output examples are simply output of the contents of the
byte array, without any supposition as to what those bytes represent.  It is
strictly a serialization format and not an encoding/decoding of the
contents.

In this example the two functions are acting as paired input/output.
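
Similarly, decode() is the input-side counterpart to encode(); a quick
sketch (assuming standard_conforming_strings is on, so '\015' is four
literal characters):

    SELECT encode(e'\r'::bytea, 'escape');  -- bytea to text: a raw CR character
    SELECT decode('\015', 'escape');        -- text to bytea: \x0d again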

I'm thinking the direction you are assuming from the word encode is
confusing you - as it did me at first.

David J.






[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-26 Thread David Johnston
Bruce Momjian wrote
  -   Issuing <command>ABORT</command> when not inside a transaction does
  -   no harm, but it will provoke a warning message.
  +   Issuing <command>ABORT</command> outside of a transaction block has no effect.
  
  Those things are not the same.
 
  Uh, I ended up mentioning no effect to highlight it does nothing,
  rather than mention a warning.  Would people prefer I say warning? 
 Or
  should I say issues a warning because it has no effect or something? 
  It is easy to change.
 
 I'd revert the change Robert highlights above.  ISTM you've changed the
 code to match the documentation; why would you then change the docs?
 
 Well, I did it to make it consistent.  The question is what to write for
 _all_ of the new warnings, including SET.  Do we say warning, do we
 say it has no effect, or do we say both?  The ABORT is just one case
 of that.

How about:

Issuing command outside of a transaction has no effect and will provoke a
warning.

I dislike does no harm because it can do harm if someone thinks the
current state is different than reality.

It is good to indicate that a warning is emitted if this is done in error;
thus reinforcing the fact people should be looking at their warnings.

when not inside uses a negative modifier while outside is more direct
and thus easier to read, IMO.  The phrase transaction block seems wordy so
I omitted the word block.

David J.






Re: [HACKERS] why semicolon after begin is not allowed in postgresql?

2013-11-26 Thread David Johnston
Josh Berkus wrote
 On 11/25/2013 03:36 PM, David Johnston wrote:
 Doh!
 
 IF / THEN / ELSE / ENDIF  (concept, not syntax)
 
 That also does help to reinforce the point being made here...
 
 David J.
 
 What point?

That the status-quo should be maintained.

David J.






Re: [HACKERS] why semicolon after begin is not allowed in postgresql?

2013-11-25 Thread David Johnston
AK wrote
 Kevin,
 
 I do see your logic now, but this thing is a common mistake - it means
 that this seems counter-intuitive to some people. What would happen if we
 applied Occam's razor and just removed this rule?
 
 All existing code would continue to work as is, and we would have one less
 rule to memorize. That would make PostgreSql a slightly better product,
 right?

I'm somewhat on the fence for this but am leaning toward maintaining the
status quo.  Mostly because of the analogy with IF ... END IF; versus the
SQL BEGIN; command, which is an entirely separate construct.

I would maybe change the documentation so that instead of simply dictating a
rule we explain why the syntax is the way it is - like this thread is doing. 
If they consciously omit the semi-colon hopefully they also understand that
what they are beginning is a code-block in plpgsql as opposed to an SQL
transaction.

That said, technical purity isn't always a good answer.  I'd be inclined to
let someone passionate enough about the idea implement it and critique the
result instead of disallowing it outright; though that is likely to arrive
at the same end.

David J.






Re: [HACKERS] why semicolon after begin is not allowed in postgresql?

2013-11-25 Thread David Johnston
Mark Kirkwood-2 wrote
 Postgres supports many procedural languages (e.g plperl, plpython) and all
 these have different 
 grammar rules from SQL - and from each other. We can't (and shouldn't) 
 try altering them to be similar to SQL - it would defeat the purpose of 
 providing a procedural environment where the given language works as 
 advertised.
 
 So in the case of plpgsql - it needs to follow the Ada grammar, 
 otherwise it would be useless.

I do not follow the useless conclusion - what, present day, does Ada have
to do with it?  And the request is to alter only plpgsql, not all the other
languages.  To the casual end-user plpgsql is an internal language under
our full control and installed by default in all new releases.  Is it really
unreasonable to expect us to design in some level of coordination between it
and SQL?

Cross-compatibility is a valid reason, though I'm guessing that, with all the
inherent differences between our standard PL and other databases' PLs,
making this change would not be a materially noticeable additional
incompatibility.

I'll even accept language consistency and not worth the effort of
special-casing but mostly because the error is immediate and obvious, and
the solution is simple and readily learned.

A side observation: why does DECLARE not require a block-end keyword but
instead BEGIN acts as effectively both start and end?  BEGIN, IF, FOR,
etc... all come in pairs but DECLARE does not.

David J.






Re: [HACKERS] why semicolon after begin is not allowed in postgresql?

2013-11-25 Thread David Johnston
Andrew Dunstan wrote
 On 11/25/2013 06:13 PM, David Johnston wrote:

 A side observation: why does DECLARE not require a block-end keyword
 but
 instead BEGIN acts as effectively both start and end?  BEGIN, IF, FOR,
 etc... all come in pairs but DECLARE does not.


 A complete block is:
 
  [ DECLARE declarations ]
  BEGIN statements
  [ EXCEPTIONS handlers ]
  END
 
 The declare and exceptions parts are optional, as indicated. Does that 
 make it clearer?

Doh!

IF / THEN / ELSE / ENDIF  (concept, not syntax)

That also does help to reinforce the point being made here...
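
For anyone following along, a minimal runnable block showing all the pieces
(the body itself is just filler):

    DO $$
    DECLARE
        n integer := 0;        -- declarations: no closing keyword of their own
    BEGIN
        n := n + 1;
        RAISE NOTICE 'n = %', n;
    EXCEPTION
        WHEN others THEN
            RAISE NOTICE 'handler reached';
    END;                       -- one END closes the whole block
    $$;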

David J.






Re: [HACKERS] Why is UPDATE with column-list syntax not implemented

2013-11-21 Thread David Johnston
AK wrote
 9.3 documentation says:
 
 According to the standard, the column-list syntax should allow a list of
 columns to be assigned from a single row-valued expression, such as a
 sub-select:
 
 UPDATE accounts SET (contact_last_name, contact_first_name) =
 (SELECT last_name, first_name FROM salesmen
  WHERE salesmen.id = accounts.sales_id);
 This is not currently implemented — the source must be a list of
 independent expressions.
 
 Why is this not implemented? Is it considered inconvenient to use, or
 difficult to implement. or not important enough, or some other reason?

I cannot answer why but I too would like to see this.  I actually asked this
a long while back but cannot seem to find my posting or recall the response.
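
In the meantime, the usual workaround is PostgreSQL's UPDATE ... FROM
extension (a sketch against the example tables; note it only touches rows
with a matching salesman, unlike the sub-select form):

    UPDATE accounts a
    SET contact_last_name  = s.last_name,
        contact_first_name = s.first_name
    FROM salesmen s
    WHERE s.id = a.sales_id;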

David J.






Re: [HACKERS] additional json functionality

2013-11-20 Thread David Johnston
Hannu Krosing-3 wrote
 On 11/18/2013 06:49 PM, Josh Berkus wrote:
 On 11/18/2013 06:13 AM, Peter Eisentraut wrote:
 On 11/15/13, 6:15 PM, Josh Berkus wrote:
 Thing is, I'm not particularly concerned about *Merlin's* specific use
 case, which there are ways around. What I am concerned about is that we
 may have users who have years of data stored in JSON text fields which
 won't survive an upgrade to binary JSON, because we will stop allowing
 certain things (ordering, duplicate keys) which are currently allowed
 in
 those columns.  At the very least, if we're going to have that kind of
 backwards compatibilty break we'll want to call the new version 10.0.
 We could do something like SQL/XML and specify the level of validity
 in a typmod, e.g., json(loose), json(strict), etc.
 Doesn't work; with XML, the underlying storage format didn't change.
 With JSONB, it will ... so changing the typemod would require a total
 rewrite of the table.  That's a POLS violation if I ever saw one
 We do rewrites on typmod changes already.
 
 To me having json(string) and json(hstore) does not seem too bad.

Three things:

1) How would this work in the face of functions that erase typemod
information?
2) json [no type mod] would have to effectively default to json(string)?
3) How would #1 and #2 interact?

I pondered the general idea but my (admittedly limited) gut feeling is that
using typemod would possibly be technically untenable and from an end-user
perspective would be even more confusing than having two types.
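
To make #1 and #2 concrete, the declarations under discussion would look
something like this (purely hypothetical syntax, per the suggestions
upthread):

    CREATE TABLE t (
        a json,          -- #2: would this default to json(string)?
        b json(hstore)   -- binary, normalized storage
    );

and any function declared to return plain json would erase whichever typmod
b carried, which is question #1.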

David J.








Re: [HACKERS] WITH ORDINALITY versus column definition lists

2013-11-20 Thread David Johnston
Tom Lane-2 wrote
 It seems to me that we don't really want this behavior of the coldeflist
 not including the ordinality column.  It's operating as designed, maybe,
 but it's unexpected and confusing.  We could either
 
 1. Reinsert HEAD's prohibition against directly combining WITH ORDINALITY
 with a coldeflist (with a better error message and a HINT suggesting that
 you can get what you want via the TABLE syntax).
 
 2. Change the parser so that the coldeflist is considered to include the
 ordinality column, for consistency with the bare-alias case.  We'd
 therefore insist that the last coldeflist item be declared as int8, and
 then probably have to strip it out internally.

#2 but I am hoping to be able to make the definition of the column optional. 
One possibility is that if you do want to provide an alias you have to make
it clear that the coldeflist item in question is only valid for a with
ordinality column alias.  Otherwise the entire coldeflist is used to alias
the record-type output and the ordinality column is provided its default
name.

Two options I came up with:

1) disallow any type specifier on the last item:  t(f1 int, f2 text, o1)
2) add a new pseudo-type, ord:  t(f1 int, f2 text, o1 ord)

I really like option #2.  It makes it perfectly clear, entirely within the
coldeflist SQL, that the last column is different and in this case optional:
optional both in the sense of providing an alias, and in that the user can
drop the whole ordinality aspect of the call as well.  The system does not
need to be told, by the user, the actual type of the ordinality column.  And
given that I would suppose most people would think to use int or bigint
before using int8, the usability there is improved once they need, and then
learn, that to alias the ordinality column they use the ord type, which
would internally resolve to the necessary output type.

Option #1 is somewhat simpler, but the slight added verbosity of #2 makes
reading the SQL coldeflist easier, IMO: you are already scanning name-type
pairs, and recognizing a missing type is, for me, harder than reading off
ord and recalling its meaning.

David J.







Re: [HACKERS] WITH ORDINALITY versus column definition lists

2013-11-20 Thread David Johnston
Tom Lane-2 wrote
 David Johnston polobo@ writes:
 Tom Lane-2 wrote
 It seems to me that we don't really want this behavior of the coldeflist
 not including the ordinality column.  It's operating as designed, maybe,
 but it's unexpected and confusing.  We could either
 
 1. Reinsert HEAD's prohibition against directly combining WITH
 ORDINALITY
 with a coldeflist (with a better error message and a HINT suggesting
 that
 you can get what you want via the TABLE syntax).
 
 2. Change the parser so that the coldeflist is considered to include the
 ordinality column, for consistency with the bare-alias case.  We'd
 therefore insist that the last coldeflist item be declared as int8, and
 then probably have to strip it out internally.
 
 Two options I came up with:
 
 1) disallow any type specifier on the last item:  t(f1 int, f2 text, o1)
 2) add a new pseudo-type, ord:  t(f1 int, f2 text, o1 ord)
 
 I really like option #2.
 
 I don't.  Pseudo-types have a whole lot of baggage.  #1 is a mess too.
 And in either case, making coldef list items optional increases the number
 of ways to make a mistake, if you accidentally omit some other column for
 instance.

I'll have to trust the baggage/mess conclusion, but if you can distinctly
and unambiguously identify the coldeflist item that is to be used for
ordinality column aliasing then the mistakes related to the
function-record-coldeflist are the same as now.  There may be more (but
still quite few, I would think) ways for the user to make a mistake, but the
syntax ones are handled anyway, and so if the others can be handled
reasonably well the UI for the feature becomes more friendly.

IOW, instead of adding int8 and ignoring it we poll the last item,
conditionally discard it (like the int8 case), then handle the possibly
modified structure as planned.


 Basically the problem here is that it's not immediately obvious whether
 the coldef list ought to include the ordinality column or not.  The user
 would probably guess not (since the system knows what type ordinality
 should be).  

Yes, if the column is not made optional somehow then I dislike option #2


 The TABLE syntax is really a vastly better solution for this.  So I'm
 thinking my #1 is the best answer, assuming we can come up with a good
 error message.  My first attempt would be
 
 ERROR: WITH ORDINALITY cannot be used with a column definition list
 HINT: Put the function's column definition list inside TABLE() syntax.
 
 Better ideas?

Works for me if #1 is implemented.



Just to clarify we are still allowing simple aliasing:

select * from generate_series(1,2) with ordinality as t(f1,f2); 
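
which, for reference, yields:

     f1 | f2
    ----+----
      1 |  1
      2 |  2
    (2 rows)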

It's only when the output of the function is record that the restriction of
placing the record-returning function call into TABLE (if you want ordinals)
comes into play.


select * from table(array_to_set(array['one', 'two']) as (f1 int,f2 text))
with ordinality as t(a1,a2,a3); 

If we could do away with having to re-specify the record-aliases in the
outer layer (a1, a2) then I'd be more understanding but I'm thinking that is
not possible unless you force a single-column alias definition attached to
WITH ORDINALITY to mean alias the ordinality column only.


On the plus side: anyone using record-returning functions is already dealing
with considerable verbosity, so this extra bit doesn't seem to be adding that
much overhead; and since the alias - t(a1,a2,a3) - is optional if you don't
care about aliasing the ordinality column, the default case is not that
verbose (just add the surrounding TABLE).

I feel like I need a flow-chart for #1...

With #2 (w/ optional) you can add in an alias for the ordinality column
anyplace you would be specifying a coldeflist OR alias list.  Favoring the
pseudo-type solution is the fact that, given the prior sentence, if you place
o1 ord in the wrong place it is possible to generate an error like "with
ordinality not present for aliasing".

#1 is simpler to implement and does not preclude #2 in the future.

Possible #3?

Not sure if this is possible at this point but really the alias for the
ordinality column would be attached directly to the ordinality keyword.

e.g., ...) with ordinality{alias} as t(a1, a2)

David J.










Re: [HACKERS] WITH ORDINALITY versus column definition lists

2013-11-20 Thread David Johnston
Tom Lane-2 wrote
 David Johnston polobo@ writes:
 Just to clarify we are still allowing simple aliasing:
 
 select * from generate_series(1,2) with ordinality as t(f1,f2); 
 
 Right, that works (and is required by spec, I believe).  It's what to
 do with our column-definition-list extension that's at issue.
 
 Not sure if this is possible at this point but really the alias for the
 ordinality column would be attached directly to the ordinality keyword.
 
 e.g., ...) with ordinality{alias} as t(a1, a2)
 
 This has no support in the standard.

Now I'm just spinning some thoughts:

) with ordinality AS t(a1 text, a2 text | ord1)  -- type-less, but a
different separator

) with ordinality AS t(a1 text, a2 text)(ord1) -- stick it in its own
section, type-less

) with ordinality AS t(a1 text, a2 text) ordinal(ord1) --name the section
too

would probably want to extend the alias syntax to match...

Is there any precedent in other RDBMS to consider?

I don't see any obvious alternatives to the ones you listed and syntax is
really not a huge barrier.  If the implementation of an optionally specified
alias is a barrier then either someone needs to feel strongly enough to
implement it or just default to #1 for the time being.

But others really haven't had a chance to read and respond yet so I'm gonna
get off this train for a while.


David J.






