Re: [HACKERS] Parallel Seq Scan

2015-02-08 Thread Amit Kapila
On Fri, Feb 6, 2015 at 11:04 PM, Robert Haas  wrote:
>
> On Fri, Feb 6, 2015 at 9:43 AM, Amit Kapila 
wrote:
> > Here is the latest patch which fixes reported issues and supported
> > Prepared Statements and Explain Statement for parallel sequential
> > scan.
> >
> > The main purpose is to get the feedback if possible on overall
> > structure/design of code before I goahead.
>
>
> 2. InitiateWorkers() is entirely specific to the concerns of parallel
> sequential scan.  After looking this over, I think there are three
> categories of things that need to be clearly separated.  Some stuff is
> going to be needed for any parallel query; some stuff is going to be
> needed only for parallel scans but will be needed for any type of
> parallel scan, not just parallel sequential scan[1]; some stuff is
> needed for any type of node that returns tuples but not for nodes that
> don't return tuples (e.g. needed for ParallelSeqScan and
> ParallelHashJoin, but not needed for ParallelHash); and some stuff is
> only going to be needed for parallel sequential scan specifically.
> This patch mixes all of those concerns together in a single function.
> That won't do; this needs to be easily extensible to whatever someone
> wants to parallelize next.
>

The master backend shares the target list, qual, scanrelid, range table,
bind params, info about the scan range (blocks), tuple queues, and
instrumentation info with the worker.  Going by your suggestion, I think
we can separate them as below:

1. parallel query - target list, qual, bind params, instrumentation info
2. parallel scan and nodes that return tuples - scanrelid, range table,
tuple queues
3. parallel sequential scan specific - info about the scan range (blocks)

This is as per the current list of things the master backend shares with
the worker; if more things are required, we can decide which category each
falls into and add it accordingly.
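
As a rough sketch of that separation (struct and field names below are purely
illustrative, not taken from the patch), the shared state could be grouped as:

/* Illustrative grouping only -- not the names used in the patch. */
typedef struct ParallelQueryInfo        /* 1. any parallel query */
{
    List       *targetList;             /* target list */
    List       *qual;                   /* qualification */
    ParamListInfo params;               /* bind parameters */
    int         instrument_options;     /* instrumentation info */
} ParallelQueryInfo;

typedef struct ParallelScanInfo         /* 2. any scan node returning tuples */
{
    Index       scanrelid;              /* range table index of the scan relation */
    List       *rangeTable;             /* range table */
    shm_mq_handle **tupleQueues;        /* one tuple queue per worker */
} ParallelScanInfo;

typedef struct ParallelSeqScanInfo      /* 3. parallel sequential scan only */
{
    BlockNumber startBlock;             /* scan range: first block */
    BlockNumber numBlocks;              /* scan range: number of blocks */
} ParallelSeqScanInfo;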

Is this similar to what you have in mind?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Parallel Seq Scan

2015-02-08 Thread Amit Kapila
On Sat, Feb 7, 2015 at 2:30 AM, Robert Haas  wrote:
>
> On Fri, Feb 6, 2015 at 2:13 PM, Robert Haas  wrote:
> > The complicated part here seems to me to figure out what we need to
> > pass from the parallel leader to the parallel worker to create enough
> > state for quals and projection.  If we want to be able to call
> > ExecScan() without modification, which seems like a good goal, we're
> > going to need a ScanState node, which is going to need to contain
> > valid pointers to (at least) a ProjectionInfo, an ExprContext, and a
> > List of quals.  That in turn is going to require an ExecutorState.
> > Serializing those things directly doesn't seem very practical; what we
> > instead want to do is figure out what we can pass that will allow easy
> > reconstruction of those data structures.  Right now, you're passing
> > the target list, the qual list, the range table, and the params, but
> > the range table doesn't seem to be getting used anywhere.  I wonder if
> > we need it.  If we could get away with just passing the target list
> > and qual list, and params, we'd be doing pretty well, I think.  But
> > I'm not sure exactly what that looks like.
>
> IndexBuildHeapRangeScan shows how to do qual evaluation with
> relatively little setup:
>

I think even to make quals work, we need to do a few extra things,
like setting up the param list and range table.  Also, for quals we need
to fix the function OIDs by calling fix_opfuncids() and do what the
ExecInit*() functions do for quals.  I think these extra things will be
required to process the qualification for a seq scan.

Then we need to construct the projection info from the target list
(again, basically what the ExecInit*() functions do).  After constructing
the ProjectionInfo, we can call ExecProject().
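
A rough sketch of that worker-side setup using the existing executor
facilities (qual, targetlist, scanrel, resultSlot and scanSlot here are
assumed to have been reconstructed from what the master backend passed;
error handling omitted):

EState     *estate = CreateExecutorState();
ExprContext *econtext = GetPerTupleExprContext(estate);
List       *qualstate;
ProjectionInfo *projInfo;

/* fix function OIDs, then do what ExecInit*() would do for the quals */
fix_opfuncids((Node *) qual);
qualstate = (List *) ExecInitExpr((Expr *) qual, NULL);

/* build the ProjectionInfo from the (initialized) target list */
fix_opfuncids((Node *) targetlist);
projInfo = ExecBuildProjectionInfo((List *) ExecInitExpr((Expr *) targetlist, NULL),
                                   econtext, resultSlot,
                                   RelationGetDescr(scanrel));

/* then, for each tuple fetched from the scan: */
econtext->ecxt_scantuple = scanSlot;
if (ExecQual(qualstate, econtext, false))
    (void) ExecProject(projInfo, NULL);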

Here we need to take care that the functions that collect instrumentation
information, like InstrStartNode(), InstrStopNode(), InstrCountFiltered1(),
etc., are called at appropriate places, so that we can collect that
information for the Explain statement when requested by the master backend.

Then finally, after sending the tuples, we need to destroy all the
execution state constructed for fetching them.

So to make this work, we basically need to do all the important work the
executor does, in three different phases: initialization of the node,
execution of the node, and ending the node.  Ideally, we could make this
work with code specific to just the execution of a sequential scan;
however, it seems to me we would again need more such code (extracted
from the core part of the executor) to make parallel execution of other
functionality like aggregation, partition seq scan, etc. work.

Another idea is to use the Executor-level interfaces (ExecutorStart(),
ExecutorRun(), ExecutorEnd()) for execution rather than the Portal-level
interfaces.  I had used the Portal-level interfaces with the thought that
we could reuse the existing Portal infrastructure to support parallel
execution of scrollable cursors, but as per my analysis it is not so easy
to support them, especially backward scans, absolute/relative fetches,
etc., so the Executor-level interfaces seem more appealing to me
(something like how the Explain statement works (ExplainOnePlan)).
Using the Executor-level interfaces has the advantage that we can reuse
them for other parallel functionality.  In this approach, we need to take
care of constructing the relevant structures (from the information passed
by the master backend) required by the Executor interfaces, but I think
these should be fewer than what we would need in the previous approach
(extracting seqscan-specific stuff from the executor).
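
A sketch of that Executor-level approach, modelled on how ExplainOnePlan()
drives a plan (plannedstmt, sourceText, params and the tuple-queue
DestReceiver are assumed to have been reconstructed in the worker from what
the master backend passed):

QueryDesc  *queryDesc;

queryDesc = CreateQueryDesc(plannedstmt,
                            sourceText,
                            GetActiveSnapshot(),
                            InvalidSnapshot,    /* no crosscheck snapshot */
                            dest,               /* DestReceiver writing to the tuple queue */
                            params,
                            instrument_options);

ExecutorStart(queryDesc, 0);
ExecutorRun(queryDesc, ForwardScanDirection, 0L);   /* 0 = run to completion */
ExecutorFinish(queryDesc);
ExecutorEnd(queryDesc);
FreeQueryDesc(queryDesc);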


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Table-level log_autovacuum_min_duration

2015-02-08 Thread Naoya Anzai
Thanks for your reply.

>> Feature test
>> 
(snip)
> I thought about using a
> special value like -2 to define the behavior you are mentioning here, 
> aka with "-2" disable custom value and GUC parameter but I don't think
> it is worth adding as that's an ugly 3 line of code of this type:
> if (log_min_duration == -2)
>     enforce_log_min = -1;

I discussed this use case with my co-workers, who said
"that code is not good", and we concluded that it is actually
a rare case.  If such a case does come up in practice, I (or someone)
will suggest a different way to handle it, separate from this patch.

We know there is a practical workaround. :)

>> Coding review
>> 
>> I think description of log_autovacuum_min_duration in reloptions.c
>> (line:215) should be like other parameters (match with guc.c). So
>> it should be "Sets the minimum execution time above which autovacuum
>> actions will be logged." but not "Log autovacuum execution for
>> given threshold".
>
>What about that then?
>"Minimum execution time above which autovacuum actions will be logged"

That roughly matches.  To match guc.c, it would be better to
add "Sets the" to that description.

>> Architecture review
>> 
>> About the modification of gram.y.
>>
>> I think it is not good that log_min_duration is initialized to
>> zeros in gram.y when "FREEZE" option is set. Because any VACUUM
>> queries never use this parameter. I think log_min_duration always
>> should be initialized to -1.
>
>Looking at this patch this morning, actually I think that my last
>patch as well as your suggestion are both wrong. To put it in short
>words, log_min_duration should be set only if VACOPT_VERBOSE is
>defined in query declaration. So I changed this portion of the patch
>this way.

>And I forgot to change VacuumStmt for the ANALYZE portion in gram.y...
>Patch updated attached.

Sorry, I could not express my opinion to you correctly.  I mean that
"log_autovacuum_min_duration" is used only by autovacuum; VACUUM
queries (including VACUUM VERBOSE) never use this parameter.  So,
when someone executes a manual VACUUM, "log_min_duration" should
always be initialized to -1.

The same applies to ANALYZE.

In other words, it is not necessary to initialize "log_min_duration"
to zero when performing VACUUM (or ANALYZE) VERBOSE.  "VERBOSE" is
provided only to change the log level of a manual VACUUM from
DEBUG2 to INFO.

Regards,
-
Naoya.




Re: [HACKERS] RangeType internal use

2015-02-08 Thread Amit Langote
On 09-02-2015 AM 10:21, Tom Lane wrote:
> 
> Meh.  I don't care for that much --- it sounds a lot like deciding that
> your problem is a nail because there is a hammer within reach.  A random
> collection of ranges doesn't seem like a very appropriate representation
> to me; first because there is no simple way to enforce that it partitions
> the key space (no overlaps and no missing portions), and second because it
> provides little purchase for efficient tuple routing algorithms.  The best
> you could possibly hope for is some log-N tree search mechanism, and that
> would require a fair amount of setup beforehand.
> 
> I'd rather insist that range partitioning be done on the basis of an
> origin point and a bin width, which would allow trivial computation of
> which bin number any given key falls in.  This will limit the set of types
> it can be done over, for sure, but not more than depending on built-in
> range types would.
> 

Okay, let me back up a little and think about your suggestion, which I do
not seem to understand very well - it raises a few questions for me:
does this mean the partitioning criterion is associated with the parent
(partitioned table) rather than with each individual partition?  I would
guess that the bin width is the partition interval, such that each bin
number gives a partition number (of equal-sized, consecutively numbered
partitions without gaps).  But I don't quite understand what the origin
point is.  Is that a key literal value from which to begin counting bins,
and if so, is it stored in the catalog as part of the partitioning rule?
Does this affect the syntax used when defining a partitioned
table/partitions, like:

CREATE TABLE parent PARTITION BY RANGE ON (col1) EVERY '3 months'::interval;

CREATE TABLE child PARTITION 1 OF parent; or some such?

Or may I ask if this is just an internal representation geared towards
optimizing range partitioning, whereas the on-disk representation is
something more generalized, allowing other partitioning strategies to
be implemented as well?

Thanks,
Amit





Re: [HACKERS] RangeType internal use

2015-02-08 Thread Heikki Linnakangas

On 02/09/2015 03:21 AM, Tom Lane wrote:

> Amit Langote  writes:
>> On 07-02-2015 AM 12:10, Tom Lane wrote:
>>> There is no good reason to assume that a range type exists at all, much
>>> less that it is unique for a subtype.  (Read the CREATE TYPE documentation
>>> if you're unclear as to why not.)  You have not said what you're trying to
>>> accomplish, but this seems like a dead end.
>>
>> Sorry I didn't mention before about an idea I am trying to implement
>> with this - it is to serialize range partition bounds as a range type
>> value per partition key column. The serialized representation of a range
>> partition bound for a partition then effectively becomes an anyarray of
>> anyrange:
>
> Meh.  I don't care for that much --- it sounds a lot like deciding that
> your problem is a nail because there is a hammer within reach.  A random
> collection of ranges doesn't seem like a very appropriate representation
> to me; first because there is no simple way to enforce that it partitions
> the key space (no overlaps and no missing portions), and second because it
> provides little purchase for efficient tuple routing algorithms.  The best
> you could possibly hope for is some log-N tree search mechanism, and that
> would require a fair amount of setup beforehand.


Building a tree or a sorted array of the min or max bounds of the ranges 
doesn't sound hard. log-N sounds fast enough.
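
For illustration, routing over ranges kept sorted by lower bound can be a
plain binary search (a sketch only; lower_bounds, nparts and the comparison
FmgrInfo are assumed to come from the partition metadata):

static int
route_to_partition(Datum key, Datum *lower_bounds, int nparts, FmgrInfo *cmpfn)
{
    int     lo = 0,
            hi = nparts - 1,
            result = -1;

    while (lo <= hi)
    {
        int     mid = (lo + hi) / 2;

        if (DatumGetInt32(FunctionCall2(cmpfn, key, lower_bounds[mid])) >= 0)
        {
            result = mid;       /* key >= this lower bound; remember it, go right */
            lo = mid + 1;
        }
        else
            hi = mid - 1;
    }
    return result;              /* -1: key sorts below the first partition */
}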



> I'd rather insist that range partitioning be done on the basis of an
> origin point and a bin width, which would allow trivial computation of
> which bin number any given key falls in.  This will limit the set of types
> it can be done over, for sure, but not more than depending on built-in
> range types would.


That sounds unnecessarily limiting. If there's a good reason to limit it 
to that, then fine, but I don't think it'd be any more difficult to make 
it flexible. Let's wait for the patch and see I guess.


- Heikki




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-08 Thread Fujii Masao
On Fri, Feb 6, 2015 at 3:42 AM, Michael Paquier
 wrote:
> Fujii Masao wrote:
>> I wrote
>>> This is an inspiration from lz4 APIs. Wouldn't it be buggy for a
>>> compression algorithm to return a size of 0 bytes as compressed or
>>> decompressed length btw? We could as well make it return a negative
>>> value when a failure occurs if you feel more comfortable with it.
>>
>> I feel that's better. Attached is the updated version of the patch.
>> I changed the pg_lzcompress and pg_lzdecompress so that they return -1
>> when failure happens. Also I applied some cosmetic changes to the patch
>> (e.g., shorten the long name of the newly-added macros).
>> Barring any objection, I will commit this.
>
> I just had a look at your updated version, ran some sanity tests, and
> things look good from me. The new names of the macros at the top of
> tuptoaster.c are clearer as well.

Thanks for the review! Pushed!

Regards,

-- 
Fujii Masao




Re: [HACKERS] pgbench -f and vacuum

2015-02-08 Thread Tatsuo Ishii
>>> Although that might be taking this thread rather far off-topic.
>> Not really sure about that, because the only outstanding objection to
>> this discussion is what happens in the startup stage if you specify -f.
>> Right now vacuum is attempted on the standard tables, which is probably
>> not the right thing in the vast majority of cases.  But if we turn that
>> off, how do we reinstate it for the rare cases that want it?  Personally
>> I would just leave it turned off and be done with it, but if we want to
>> provide some way to re-enable it, this --startup-script=FILE gadget
>> sounds like a pretty decent idea.
> (Catching up with this thread)
> Yes I think that it would be more simple to simply turn off the
> internal VACUUM if -f is specified. For the cases where we still need
> to vacuum the tables pgbench_*, we could simply document to run a
> VACUUM on them.

Agreed. Here is the patch to implement the idea: -f just implies -n.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index ddd11a0..f3d7e90 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -2872,6 +2872,7 @@ main(int argc, char **argv)
 			case 'f':
 				benchmarking_option_set = true;
 				ttype = 3;
+				is_no_vacuum = true;	/* "-f" implies "-n" (no vacuum) */
 				filename = pg_strdup(optarg);
 				if (process_file(filename) == false || *sql_files[num_files - 1] == NULL)
 					exit(1);
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 7d203cd..b545ea3 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -316,6 +316,10 @@ pgbench  options  dbname
 -N, -S, and -f
 are mutually exclusive.

+   
+Please note that -f implies -n, which means that VACUUM for the standard pgbench tables will not be performed before the benchmarking.
+You should run the VACUUM command beforehand if necessary.
+   
   
  
 



Re: [HACKERS] RangeType internal use

2015-02-08 Thread Tom Lane
Amit Langote  writes:
> On 07-02-2015 AM 12:10, Tom Lane wrote:
>> There is no good reason to assume that a range type exists at all, much
>> less that it is unique for a subtype.  (Read the CREATE TYPE documentation
>> if you're unclear as to why not.)  You have not said what you're trying to
>> accomplish, but this seems like a dead end.

> Sorry I didn't mention before about an idea I am trying to implement
> with this - it is to serialize range partition bounds as a range type
> value per partition key column. The serialized representation of a range
> partition bound for a partition then effectively becomes an anyarray of
> anyrange:

Meh.  I don't care for that much --- it sounds a lot like deciding that
your problem is a nail because there is a hammer within reach.  A random
collection of ranges doesn't seem like a very appropriate representation
to me; first because there is no simple way to enforce that it partitions
the key space (no overlaps and no missing portions), and second because it
provides little purchase for efficient tuple routing algorithms.  The best
you could possibly hope for is some log-N tree search mechanism, and that
would require a fair amount of setup beforehand.

I'd rather insist that range partitioning be done on the basis of an
origin point and a bin width, which would allow trivial computation of
which bin number any given key falls in.  This will limit the set of types
it can be done over, for sure, but not more than depending on built-in
range types would.
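
A sketch of the trivial bin computation being described, for an integer key
(origin and width would come from the partitioning definition):

static inline int
range_partition_bin(int64 key, int64 origin, int64 width)
{
    int64   delta = key - origin;
    int64   bin = delta / width;

    /* floor division, so keys below the origin land in negative bins */
    if (delta < 0 && delta % width != 0)
        bin--;

    return (int) bin;
}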

regards, tom lane




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2015-02-08 Thread David Steele
On 2/2/15 3:49 PM, David Steele wrote:

> The role-base approach being considered may strike some as a misuse of
> the role system, but to my eye it is syntactically very close to how
> Oracle does auditing prior to 12c.  Say you wanted to audit selects on
> the table hr.employee:
>
> Oracle: AUDIT SELECT ON hr.employee;
> pgaudit: GRANT SELECT ON hr.employee TO audit; (assuming audit is the
> role defined by pgaudit.roles)
>
> Object-based auditing in Oracle would be very easy to migrate to the
> grants needed for pgaudit.  In addition, if an AUDIT command were
> introduced later in core, it would be easy to migrate from pgaudit to
> AUDIT assuming the syntax was similar to grant, which seems plausible.

I decided to take a whack at this and see what I could come up with, starting 
with the code in master at https://github.com/2ndQuadrant/pgaudit.

I modified pgaudit.log to work similarly to Oracle's session-level logging,
meaning user statements are logged rather than the tables that are accessed.
pgaudit.log still has the various classes of commands, and only those commands
that match one of the classes are logged.  Note that the pgaudit.log GUC is
SUSET but can be set at the cluster, database, or user level.
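
For context, a minimal sketch of how those two settings could be declared
(illustrative only; the descriptions and hook arguments are not taken from the
attached pgaudit.c):

static char *pgaudit_log_setting = NULL;
static char *pgaudit_role_setting = NULL;

void
_PG_init(void)
{
    DefineCustomStringVariable("pgaudit.log",
                               "Classes of statements logged by session audit logging.",
                               NULL,
                               &pgaudit_log_setting,
                               "NONE",
                               PGC_SUSET,       /* superuser-settable; also usable with */
                               GUC_LIST_INPUT,  /* ALTER DATABASE/ROLE ... SET */
                               NULL, NULL, NULL);

    DefineCustomStringVariable("pgaudit.role",
                               "Master role used for object audit logging via grants.",
                               NULL,
                               &pgaudit_role_setting,
                               "",
                               PGC_SUSET,
                               0,
                               NULL, NULL, NULL);
}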

An example - log all statements that create an object or read data:

   DB: connect user postgres, database postgres
  SQL: set pgaudit.log = 'DEFINITION, READ'
  SQL: create user user1

   DB: connect user user1, database postgres
  SQL: create table account
   (
   id int,
   name text,
   password text,
   description text
   );
AUDIT: SESSION,DEFINITION,CREATE TABLE,TABLE,public.account,

  SQL: select *
 from account;
AUDIT: SESSION,READ,SELECT,,,

  SQL: insert into account (id, name, password, description)
values (1, 'user1', 'HASH1', 'blah, blah');
AUDIT: 

Object auditing is done via the grant system (similar to Oracle object
auditing), but there is now a single audit role (defined by the pgaudit.role
GUC, which can also be set at the cluster, database, or user level).

An example - using column-level grants since they are more interesting:

 DB: connect user postgres, database postgres
  SQL: set pgaudit.log = 'NONE'
  SQL: create role audit
  SQL: set pgaudit.role = 'audit'

   DB: connect user user1, database postgres

  SQL: grant select (password)
  on public.account
  to audit;
AUDIT: 

  SQL: select id, name
 from account;
AUDIT: 

  SQL: select password
 from account;
AUDIT: OBJECT,READ,SELECT,TABLE,public.account,

  SQL: grant update (name, password)
  on public.account
  to audit;
AUDIT: 

  SQL: update account
  set description = 'yada, yada';
AUDIT: 

  SQL: update account
  set password = 'HASH2';
AUDIT: OBJECT,WRITE,UPDATE,TABLE,public.account,

Session and object auditing can be used together so a statement that does not 
match on an object may still be session logged depending on the settings.

An example - in this case the pgaudit.log GUC will be set on the user and 
grants persist from the previous example.  Another table is added to show how 
that affects logging:

   DB: connect user postgres, database postgres
  SQL: alter role user1 set pgaudit.log = 'READ,WRITE';
AUDIT: 

   DB: connect user user1, database postgres
  SQL: create table account_role_map
   (
   account_id int,
   role_id int
   );
AUDIT: 

  SQL: grant select
  on public.account_role_map
  to audit;
AUDIT: 

  SQL: select account.password, 
  account_role_map.role_id
 from account
  inner join account_role_map
   on account.id = account_role_map.account_id
AUDIT: SESSION,READ,SELECT,,,
AUDIT: OBJECT,READ,SELECT,TABLE,public.account,
AUDIT: OBJECT,READ,SELECT,TABLE,public.account_role_map,

  SQL: update account
  set description = 'yada, yada';
AUDIT: SESSION,WRITE,UPDATE,,,

  SQL: update account
  set description = 'yada, yada'
where password = 'HASH2';
AUDIT: SESSION,WRITE,UPDATE,,,
AUDIT: OBJECT,WRITE,UPDATE,TABLE,public.account,

That about covers it.  I'd be happy to create a pull request to contribute the 
code back to 2ndQuadrant.  There are some things I'm still planning to do, but 
I think this draft looks promising.  pgaudit.c is attached.

Thoughts and suggestions are welcome.

-- 
David Steele
da...@pgmasters.net

/*
 * pgaudit/pgaudit.c
 *
 * Copyright © 2014-2015, PostgreSQL Global Development Group
 *
 * Permission to use, copy, modify, and distribute this software and
 * its documentation for any purpose, without fee, and without a
 * written agreement is hereby granted, provided that the above
 * copyright notice and this paragraph and the following two
 * paragraphs appear in all copies.
 *
 * IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT,
 * INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
 * LOST PRO

Re: [HACKERS] RangeType internal use

2015-02-08 Thread Amit Langote
On 07-02-2015 AM 12:10, Tom Lane wrote:
> Amit Langote  writes:
>> I wonder why I cannot find a way to get a range type for a given (sub-)
>> type. I would like to build a RangeType from Datum's of lower and upper
>> bounds. Much like how construct_array() builds an ArrayType from a Datum
>> array of elements given elements' type info.
> 
>> Is there some way I do not seem to know? If not, would it be worthwhile
>> to make something like construct_range() that returns a RangeType given
>> Datum's of lower and upper bounds and subtype info?
> 
> There is no good reason to assume that a range type exists at all, much
> less that it is unique for a subtype.  (Read the CREATE TYPE documentation
> if you're unclear as to why not.)  You have not said what you're trying to
> accomplish, but this seems like a dead end.
> 

We do have a number of built-in range types which we can safely assume
to exist with whatever default behavior they are hard-coded with (which
I guess are not 'ALTER TYPE'-able). Those existing range types cover a
useful set of sub-types which are probably also the potential candidates
to be used for range partitioning.

Now, one may define a new range type for a given sub-type with a
different subopclass, subcollation, subcanonical or subdiff, which is
perhaps the problem you are mentioning. Perhaps not very appropriate a
solution, but how about a rngtypisdefault (for a sub-type) flag?

Sorry I didn't mention before about an idea I am trying to implement
with this - it is to serialize range partition bounds as a range type
value per partition key column. The serialized representation of a range
partition bound for a partition then effectively becomes an anyarray of
anyrange:

+ rangebounds = construct_array(datum,
+   partnatts,
+   ANYRANGEOID,
+   -1,
+   false,
+   'd');

each element of the passed datum array is a range type value of the
corresponding subtype.

Now, range types may not yet be ripe for internal use but the above idea
kind of works modulo the absence of infrastructure I was asking for in OP.

Am I still missing something?

Thanks,
Amit





Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-02-08 Thread Tom Lane
Ryan Kelly  writes:
> On Tue, Feb 3, 2015 at 11:10 AM, Tom Lane  wrote:
>> It's probably not reasonable to add a pstate argument to
>> LookupExplicitNamespace itself, given that namespace.c isn't part
>> of the parser.  But you could imagine adding an interface function
>> in the parser that calls LookupExplicitNamespace and then throws a
>> position-aware error on failure, and then making sure that all schema
>> lookups in the parser go through that.

> While searching for other instances of this problem, I found that using
> a schema that does not exist in conjunction with a collation reported the
> position. That code uses setup_parser_errposition_callback and the
> documentation for that function seems to indicate that the usage of it in my
> newly attached patch are consistent. I do not know, however, if this is the
> cleanest approach or if I should actually create a function like
> validate_explicit_namespace to handle this (and I'm also not sure where
> such a function should live, if I do create it).

Yeah, setup_parser_errposition_callback would work too.  I'm not sure
offhand which I like better.  One thing to keep in mind is that the
callback approach results in adding an error cursor position to *any*
error thrown while the callback is active, not only "schema not found".
There are pluses and minuses to that.  I've seen error cursors attached
to very bizarre internal problems that (more or less by chance) showed up
while the parser was looking up a table name, but weren't really connected
to the table name at all.  OTOH, most of the time you'd just as soon not
be too picky about what conditions you provide a cursor for.

The main place I'd question what you did is the callback placement around
make_oper_cache_key --- that might work, but it seems very bizarre, and
perhaps more than usually prone to the "cursor given for unrelated
problem" issue.  Perhaps better to push it down inside that function
so it surrounds just the namespace lookup call.
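
For reference, the callback pattern under discussion looks roughly like this
when wrapped tightly around the namespace lookup (pstate, schemaname and
location come from the surrounding parse context):

ParseCallbackState pcbstate;
Oid         namespaceId;

setup_parser_errposition_callback(&pcbstate, pstate, location);
namespaceId = LookupExplicitNamespace(schemaname, false);
cancel_parser_errposition_callback(&pcbstate);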

Also the diffs in parse_utilcmd.c are very confusing and seem to change
more code than is necessary --- why did that happen?

regards, tom lane




Re: [HACKERS] What exactly is our CRC algorithm?

2015-02-08 Thread Abhijit Menon-Sen
At 2015-02-08 18:46:30 +0200, hlinnakan...@vmware.com wrote:
>
> So I propose to move pg_crc.c to src/common, and move the tables
> from pg_crc_tables.h directly to pg_crc.c. Thoughts?

Sounds fine to me.

-- Abhijit




Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-02-08 Thread Ryan Kelly
On Tue, Feb 3, 2015 at 11:10 AM, Tom Lane  wrote:
> Ryan Kelly  writes:
>> The attached patch adds a LINE: ... hint when schemaname.typename
>> results in a schema which does not exist. I came across this when a
>> missing comma in a SELECT list resulted in an error without a location
>> in a query a few thousand lines long.
>
> I think this is a good problem to attack, but the proposed patch seems
> a bit narrow and brute-force.  Surely this isn't the only place where
> we're missing an errposition pointer on a "no such schema" error?

I managed to find four code paths that failed to report an error position if
the schema did not exist:
- Schema-qualified types (e.g. select schemaname.typename 'foo')
- Schema-qualified operators (e.g. select 1 operator(schemaname.+) 1)
- Schema-qualified table names in CREATE TABLE (e.g. create table
schemaname.tablename (colname integer))
- Schema-qualified function calls (e.g. select schemaname.funcname())

> It's probably not reasonable to add a pstate argument to
> LookupExplicitNamespace itself, given that namespace.c isn't part
> of the parser.  But you could imagine adding an interface function
> in the parser that calls LookupExplicitNamespace and then throws a
> position-aware error on failure, and then making sure that all schema
> lookups in the parser go through that.

While searching for other instances of this problem, I found that using
a schema that does not exist in conjunction with a collation reported the
position. That code uses setup_parser_errposition_callback and the
documentation for that function seems to indicate that the usage of it in my
newly attached patch are consistent. I do not know, however, if this is the
cleanest approach or if I should actually create a function like
validate_explicit_namespace to handle this (and I'm also not sure where
such a function should live, if I do create it).

-Ryan Kelly


missing_type_schema_hint_v2.patch
Description: Binary data



Re: [HACKERS] Parallel Seq Scan

2015-02-08 Thread Robert Haas
On Sat, Feb 7, 2015 at 10:36 PM, Amit Kapila  wrote:
>> Well, I agree with you, but I'm not really sure what that has to do
>> with the issue at hand.  I mean, if we were to apply Amit's patch,
>> we'd been in a situation where, for a non-parallel heap scan, heapam.c
>> decides the order in which blocks get scanned, but for a parallel heap
>> scan, nodeParallelSeqscan.c makes that decision.
>
> I think other places also decides about the order/way heapam.c has
> to scan, example the order in which rows/pages has to traversed is
> decided at portal/executor layer and the same is passed till heap and
> in case of index, the scanlimits (heap_setscanlimits()) are decided
> outside heapam.c and something similar is done for parallel seq scan.
> In general, the scan is driven by Scandescriptor which is constructed
> at upper level and there are some API's exposed to derive the scan.
> If you are not happy with the current way nodeParallelSeqscan has
> set the scan limits, we can have some form of callback which do the
> required work and this callback can be called from heapam.c.

I thought about a callback, but what's the benefit of doing that vs.
hard-coding it in heapam.c?  If the upper-layer wants to impose a TID
qual or similar then heap_setscanlimits() makes sense, but that's
effectively a filter condition, not a policy decision about the access
pattern.
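
For reference, the existing heap_setscanlimits() interface has the caller
restrict the block range before iterating, roughly like this (rel, snapshot,
start_blockno and numblocks are placeholders):

HeapScanDesc scan;
HeapTuple   tuple;

scan = heap_beginscan(rel, snapshot, 0, NULL);
heap_setscanlimits(scan, start_blockno, numblocks);

while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
{
    /* process tuple */
}

heap_endscan(scan);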

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] New CF app deployment

2015-02-08 Thread Robert Haas
On Sun, Feb 8, 2015 at 11:04 AM, Magnus Hagander  wrote:
> Filenames are now shown for attachments, including a direct link to the
> attachment itself.  I've also run a job to populate all old threads.

Thanks!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] assessing parallel-safety

2015-02-08 Thread Robert Haas
On Sun, Feb 8, 2015 at 11:31 AM, Noah Misch  wrote:
> On Sat, Feb 07, 2015 at 08:18:55PM -0500, Robert Haas wrote:
>> There are a few problems with this design that I don't immediately
>> know how to solve:
>>
>> 1. I'm concerned that the query-rewrite step could substitute a query
>> that is not parallel-safe for one that is.  The upper Query might
>> still be flagged as safe, and that's all that planner() looks at.
>
> I would look at determining the query's parallel safety early in the planner
> instead; simplify_function() might be a cheap place to check.  Besides
> avoiding rewriter trouble, this allows one to alter parallel safety of a
> function without invalidating Query nodes serialized in the system catalogs.

Thanks, I'll investigate that approach.

>> 2. Interleaving the execution of two parallel queries by firing up two
>> copies of the executor simultaneously can result in leaving parallel
>> mode at the wrong time.
>
> Perhaps the parallel mode state should be a reference count, not a boolean.
> Alternately, as a first cut, just don't attempt parallelism when we're already
> in parallel mode.

I think changing it to a reference count makes sense.  I'll work on that.
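
A sketch of the reference-count idea - a nesting level rather than a boolean
(function names here are illustrative, not necessarily what will get
committed):

static int  parallelModeLevel = 0;

void
EnterParallelMode(void)
{
    parallelModeLevel++;
}

void
ExitParallelMode(void)
{
    Assert(parallelModeLevel > 0);
    parallelModeLevel--;
}

bool
IsInParallelMode(void)
{
    return parallelModeLevel > 0;
}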

>> 3. Any code using SPI has to think hard about whether to pass
>> OPT_CURSOR_NO_PARALLEL.  For example, PL/pgsql doesn't need to pass
>> this flag when caching a plan for a query that will be run to
>> completion each time it's executed.  But it DOES need to pass the flag
>> for a FOR loop over an SQL statement, because the code inside the FOR
>> loop might do parallel-unsafe things while the query is suspended.
>
> That makes sense; the code entering SPI knows best which restrictions it can
> tolerate for the life of a given cursor.  (One can imagine finer-grained rules
> in the future.  If the current function is itself marked parallel-safe, it's
> safe in principle for a FOR-loop SQL statement to use parallelism.)  I do
> recommend inverting the sense of the flag, so unmodified non-core PLs will
> continue to behave as they do today.

Yeah, that's probably a good idea.  Sort of annoying, but playing with
the patch in the OP made it pretty clear that we cannot possibly just
assume parallelism is OK by default.  In the core code, parallelism is
OK in more places than not, but in the PLs it seems to be the reverse.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] What exactly is our CRC algorithm?

2015-02-08 Thread Andres Freund
On 2015-02-08 18:46:30 +0200, Heikki Linnakangas wrote:
> So I propose to move pg_crc.c to src/common, and move the tables from
> pg_crc_tables.h directly to pg_crc.c. Thoughts?

+1.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] What exactly is our CRC algorithm?

2015-02-08 Thread Heikki Linnakangas

On 01/09/2015 10:32 AM, Abhijit Menon-Sen wrote:

> 1. The slicing-by-8 patch contains numerous changes:


With this patch, CALC_CRC32C is no longer a pure macro, but includes a 
function call. That means that you can no longer just #include pg_crc.h 
and pg_crc_tables.h in an external program. We made that arrangement 
with two header files in 2012 [1], and added src/port/pg_crc.c which 
just #includes pg_crc_tables.h, so that the backend and frontend 
programs that use libpgport will have just one copy of the tables.


Now that there's some actual code in pg_crc.c, I think we have to just 
give up on being able to get the CRC implementation without libpgport. 
It was a nice thought, but I doubt there are any programs out there that 
would have a problem with that. Anything that wants to read the WAL 
needs xlogreader.c anyway.


But actually, we should now move pg_crc.c to src/common. It was a bit of 
a hack to have it in src/port in the first place, because it has nothing 
to do with portability, but src/common didn't exist back then. Now it does.


So I propose to move pg_crc.c to src/common, and move the tables from 
pg_crc_tables.h directly to pg_crc.c. Thoughts?


[1] 
http://www.postgresql.org/message-id/flat/CAAZKuFaNcf3=YtadkWwr8yHb+1axW2RepmQ2j8a9NNGkV7PN=w...@mail.gmail.com.


- Heikki





Re: [HACKERS] assessing parallel-safety

2015-02-08 Thread Noah Misch
On Sat, Feb 07, 2015 at 08:18:55PM -0500, Robert Haas wrote:
> There are a few problems with this design that I don't immediately
> know how to solve:
> 
> 1. I'm concerned that the query-rewrite step could substitute a query
> that is not parallel-safe for one that is.  The upper Query might
> still be flagged as safe, and that's all that planner() looks at.

I would look at determining the query's parallel safety early in the planner
instead; simplify_function() might be a cheap place to check.  Besides
avoiding rewriter trouble, this allows one to alter parallel safety of a
function without invalidating Query nodes serialized in the system catalogs.

> 2. Interleaving the execution of two parallel queries by firing up two
> copies of the executor simultaneously can result in leaving parallel
> mode at the wrong time.

Perhaps the parallel mode state should be a reference count, not a boolean.
Alternately, as a first cut, just don't attempt parallelism when we're already
in parallel mode.

> 3. Any code using SPI has to think hard about whether to pass
> OPT_CURSOR_NO_PARALLEL.  For example, PL/pgsql doesn't need to pass
> this flag when caching a plan for a query that will be run to
> completion each time it's executed.  But it DOES need to pass the flag
> for a FOR loop over an SQL statement, because the code inside the FOR
> loop might do parallel-unsafe things while the query is suspended.

That makes sense; the code entering SPI knows best which restrictions it can
tolerate for the life of a given cursor.  (One can imagine finer-grained rules
in the future.  If the current function is itself marked parallel-safe, it's
safe in principle for a FOR-loop SQL statement to use parallelism.)  I do
recommend inverting the sense of the flag, so unmodified non-core PLs will
continue to behave as they do today.

Thanks,
nm




Re: [HACKERS] ecpg array support, or lack thereof

2015-02-08 Thread Michael Meskes
On Wed, Feb 04, 2015 at 10:07:07AM -0500, Tom Lane wrote:
> Michael Meskes would be the authority on that question, so I've cc'd
> him to make sure he notices this thread ...

Thanks Tom.

> > The leaks were in the code that takes a host variable, and converts it 
> > into a string for sending to the server as a query parameter. In 
> > particular, it was in code that deals with arrays. However, the fine 
> > manual says:
> 
> >> SQL-level arrays are not directly supported in ECPG. It is not
> >> possible to simply map an SQL array into a C array host variable.
> >> This will result in undefined behavior. Some workarounds exist,
> >> however.
> 
> > That very clearly says that the code that was fixed is not actually 
> > supported.
> 
> It seems quite possible to me that this manual text is out of date.

It is, at least partly. 

> > 1. In timestamp, interval, and date, the array offset is calculated 
> > incorrectly:

Same for numeric it seems.

> > 2. The constructed array looks like this (for timestamp):
> 
> > array [2005-06-23 11:22:33,2004-06-23 11:22:33]
> 
> > That's bogus. It's sent as a query parameter, not embedded into an SQL 
> > string, so the syntax should be '{...}'.

Right.  Now I can imagine that this is due to changes over the years.  What
worries me, though, is that this was fixed for integers, but only for integers,
not even for the short or long variants thereof.  No idea how that happened.
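
For illustration, as a query parameter the timestamp array from point 2 would
need to be rendered in array-literal form rather than as an SQL array [...]
constructor; a rough sketch (element quoting chosen defensively):

char    param[256];

snprintf(param, sizeof(param), "{\"%s\",\"%s\"}",
         "2005-06-23 11:22:33", "2004-06-23 11:22:33");
/* param now holds: {"2005-06-23 11:22:33","2004-06-23 11:22:33"} */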

> > It would be nice to fix these, and add a test case to cover all kinds of 
> > arrays. Although technically, it works as advertised, because the manual 
> > says that it will result in "undefined behaviour".

:)

At the very least it should cover the different types that actually get code 
copied and pasted.

I'm on it as my time permits.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL




Re: [HACKERS] New CF app deployment

2015-02-08 Thread Magnus Hagander
On Sat, Feb 7, 2015 at 1:46 PM, Magnus Hagander  wrote:

> On Sat, Feb 7, 2015 at 1:02 AM, Jeff Janes  wrote:
>
>>
>>>
>>> One thing that would probably *help* is if the list of attachments
>>> mentioned the names of the files that were attached to each message
>>> rather than just noting that they have some kind of attachment.  If
>>> people name their attachments sensibly, then you'll be able to
>>> distinguish parallel-seqscan-v23.patch from
>>> test-case-that-breaks-parallel-seqscan.sql, and that would be nice.
>>>
>>
>> Yes, I was going to request that as well.
>>
>>
> Ok, this is easy enough. There's a field missing in an API call but it
> shouldn't be that hard - I'll add this to the short term todo.
>
>
Filenames are now shown for attachments, including a direct link to the
attachment itself.  I've also run a job to populate all old threads.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Fetch zero result rows when executing a query?

2015-02-08 Thread Marko Tiikkaja

On 2015-02-08 09:56, Shay Rojansky wrote:

>>> More to the point, doesn't max_rows=1 have exactly the same dangers as
>>> LIMIT 1? The two seem to be identical, except that one is expressed in the
>>> SQL query and the other at the network protocol level.
>>
>> The planner does not have access to network protocol level? options while
>> it does know about LIMIT.
>
> That's an internal PostgreSQL matter (which granted, may impact efficiency).
> My comment about max_rows being equivalent to LIMIT was meant to address
> Marko's argument that max_rows is dangerous because any row might come out
> and tests may pass accidentally (but that holds for LIMIT 1 as well, doesn't
> it).


The point is that then the user gets to choose the behavior.  LIMIT 1
without ORDER BY is very explicitly telling the reader of the code
"there might be more than one row returned by this query, but I'm okay
with getting only one of them, whichever it is".  And when the LIMIT 1
is *not* there, you get the driver automatically checking your queries
for sanity.  If the driver always throws away the rows after the first
one, it's difficult to later move to a behavior of enforcing that no more
than one row is returned.


Anyway, like you said somewhere upthread, the interface the driver 
you're working on promises to implement right now can't be changed due 
to backwards compatibility concerns.  But I see new interfaces being 
created all the time, and they all make this same mistake.



.m




Re: [HACKERS] Fetch zero result rows when executing a query?

2015-02-08 Thread Shay Rojansky
First a general comment:

> Then the driver writers that need these special API behaviors are
> reasonably expected to contribute to adding them to backend products that
> do not already have them.  The database developers are not going to take on
> responsibility for the API decisions of others; and features deemed (or
> that in reality are) of marginal usefulness are likely to be omitted -
> intentionally or otherwise - from the official (in this case libpq)
> protocol.

I absolutely agree with you there, I'm not trying to get anybody to implement
something I need (i.e. fetch 0 rows). This is more of a general discussion as
to whether that feature *makes sense* to you as a protocol feature (which
doesn't seem to be the case, as some of you guys want to deprecate the whole
max_rows thing).

>> I'll clarify just a little... I am indeed talking about the PostgreSQL
>> network protocol, and not about query optimization (with LIMIT or omitting
>> RETURNING etc.). I am implementing ADO.NET's ExecuteNonQuery
>> through which the user indicates they're not interested in any result rows
>> whether those exist or not.

> ExecuteNonQuery returns an integer while row-returning queries do not.
> I'd argue that the API states that the user is declaring that the query
> they are executing does not return any actual rows - just a count of
> affected rows - not that they do not care to see what rows are returned.

That's true. IMHO the count of affected rows isn't relevant to this discussion
so I didn't mention it.

>> For the situation where a user does ExecuteNonQuery but the query returns
>> result rows, the driver can save the needless network transfers. We can
>> definitely say it's the user's fault for providing a query with a resultset
>> to ExecuteNonQuery, but we *do* have the user's clear intention that no
>> rows be fetched so why not act on it. I agree this isn't a terribly
>> important optimization, the trigger for this question was first and
>> foremost curiosity: it seems strange the protocol allows you to specify
>> max_rows for any value other than 0.

> Yes, it does seem strange and, like Marko said, ideally would be
> deprecated.  The fact that it cannot handle "zero rows" seems like an
> unnecessary limitation and I cannot image that any values other than 0 and
> all would be of practical usage.  In the case of zero returning instead the
> number of rows would be more useful than simply refusing to return anything
> so even if something like this is needed the current implementation is
> flawed.

Just to be precise: what is strange to me is that the max_rows feature exists
but has no 0 value. You and Marko are arguing that the whole feature should be
deprecated (i.e. always return all rows).
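
For reference, the field in question is the trailing Int32 of the protocol's
Execute message; a rough sketch of what a driver puts on the wire (0 in that
field already means "no limit", which is why there is no way to ask for zero
rows):

#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Sketch only: build an Execute ('E') message into "out", returning its size. */
static int
build_execute_message(const char *portal, int32_t max_rows, char *out)
{
    int32_t portal_len = (int32_t) strlen(portal) + 1;  /* include the NUL */
    int32_t len = htonl(4 + portal_len + 4);            /* length counts itself */
    char   *p = out;

    *p++ = 'E';                                         /* message type */
    memcpy(p, &len, 4);            p += 4;
    memcpy(p, portal, portal_len); p += portal_len;     /* "" = unnamed portal */
    max_rows = htonl(max_rows);                         /* 0 = return all rows */
    memcpy(p, &max_rows, 4);       p += 4;
    return (int) (p - out);
}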

>> Here's a possible believable use-case which doesn't involve user neglect:
>> imagine some server-side function which has side-effects and also returns
>> some rows. In some situations the user is interested in the result rows,
>> but in others they only want the side-effect. The user would probably have
>> no control over the function, and their only way to *not* transfer the
>> result rows would be with a mechanism such as max_rows.

> Functions always return rows and so should not be executed using
> "ExecuteNonQuery".  In most cases action-oriented functions return a single
> result-status row so ignoring that row, while likely not advisable, is not
> exactly expensive.

Your description of functions doesn't hold for all functions, this is why I
tried to provide a usecase. It is possible for some function to both have a
side-effect (i.e. modify some table) *and* return a large number of rows. It
may be legitimate for a user to want to have the side-effect but not care
about the rows.  Ignoring one row isn't expensive, ignoring many could be.

> The basic question here becomes - the executor already must generate, in
> memory, all of the rows so is there a way to properly interact with the
> server where you can request the number of rows that were generated but not
> be obligated to actually pull them down to the client.  This doesn't seem
> like an unreasonable request but assuming that it is not currently possible
> (of which I have little clue) then the question becomes who cares enough to
> design and implement such a protocol enhancement.

OK.

>> More to the point, doesn't max_rows=1 have exactly the same dangers as
>> LIMIT 1? The two seem to be identical, except that one is expressed in the
>> SQL query and the other at the network protocol level.

> The planner does not have access to network protocol level? options while
> it does know about LIMIT.

That's an internal PostgreSQL matter (which granted, may impact efficiency).
My comment about max_rows being equivalent to LIMIT was meant to address
Marko's argument that max_rows is dangerous because any row might come out
and tests may pass accidentally (but that holds for LIMIT 1 as well, doesn't
it).

> Expecting users to use an API without knowle