Re: [HACKERS] Don't allow relative path for copy from file

2012-08-16 Thread Etsuro Fujita
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]

 Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
  As described in the reference manual for COPY, we should check the file's
  path format so as not to allow relative paths.  Please find attached a patch.
 
 The argument for disallowing writing to a relative path is to make it
 harder to accidentally overwrite a database file.  That argument does
 not apply to COPY IN, so I'm not convinced we should impose an
 additional restriction.  It's not out of the question that this would
 break real-world use-cases --- imagine someone whose workflow involves
 copying data files across a network to a directory accessible to the
 server (and quite possibly specified by a relative path) and then doing
 COPY IN.
 
 In any case, this patch is missing documentation updates, specifically
 the paragraph in the COPY reference page that it falsifies.

Agreed.  I'd like to withdraw the patch sent in the earlier post, and propose to
update the documentation in the COPY reference page.  Please find attached a
patch.
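
For illustration (the table and file names below are made up): a relative path in
COPY ... FROM is resolved against the server process's working directory, normally
the cluster's data directory, so the workflow Tom describes can keep working with
something like:

    COPY mytable FROM 'staging/datafile.csv';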

Thanks,

Best regards,
Etsuro Fujita


copy_ref_page.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] feature request: auto savepoint for interactive psql when in transaction.

2012-08-16 Thread Daniel Farina
On Mon, Nov 14, 2011 at 3:05 PM, Ross Reedstrom reeds...@rice.edu wrote:
 On Mon, Nov 14, 2011 at 02:45:04PM -0800, Will Leinweber wrote:
 My coworker Dan suggested that some people copy and paste scripts. However
 I feel that that is an orthogonal problem and if there is a very high rate
 of input psql should detect that and turn interactive off. And I
 still strongly feel that on_error_rollback=interactive should be the
 default.

 Hmm, I think that falls under the "don't do that, then" use case. I've been
 known to c&p the occasional script - I guess the concern here would be not
 seeing failed steps that scrolled off the terminal. (I set my scrollback to
 basically infinity and actually use it, but then I'm strange that way :-) )

I do this and have done this all the time.  Because emacs.  On the
other hand, I only really do it in line-buffered modes.  I also feel
there is something of a development/production parity that is broken
by this, but then again, so are backslash commands interpreted by
psql, and that has never proven to be a practical problem.  That I know of.

I wouldn't let my particular use case (M-x shell) get in the way of
changing the default if that was the consensus because I think this
would help a lot more people than hurt.  In the discoverability
department, one can avoid hacking the server error message by having psql
emit its own hint when it receives an error while in a transaction
block (vs. auto-commit).  This is knowable via the ReadyForQuery
message, which can tell you "idle in transaction" (its transaction-status
byte is 'I' for idle, 'T' for in a transaction block, 'E' for a failed
transaction block).

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] UNION ALL with WHERE clause does not use Merge Append

2012-08-16 Thread Marti Raudsepp
Hi list,

I have a table with (start_time, end_time) and I want to tally up
the number of concurrent connections at any point in time. The Merge
Append plan node introduced in 9.1 would be perfect for this purpose.
It seems to work out fine for the most trivial case, but as soon as I
add a WHERE clause to any of the UNION subqueries, this stops
working. Tested on 9.1.4 and 9.2beta3.
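
For context, the real query is presumably something like the sketch below
(table and column names are only illustrative), where each branch of the
UNION ALL can be read in timestamp order and a running sum of +1/-1 events
gives the number of concurrent connections at each point in time:

    SELECT ts, sum(delta) OVER (ORDER BY ts) AS concurrent
    FROM (
        SELECT start_time AS ts, +1 AS delta FROM connections
      UNION ALL
        SELECT end_time, -1 FROM connections
    ) events
    ORDER BY ts;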

Here's a simplified test case:

create table foo as select generate_series(1,100) i;
create index on foo(i);
vacuum analyze foo;

This works as expected:
EXPLAIN ANALYZE
  select i from foo
UNION ALL
  select i from foo
ORDER BY 1 LIMIT 100;

 Limit  (cost=0.01..3.31 rows=100 width=4) (actual time=0.028..0.078 rows=100 loops=1)
   ->  Result  (cost=0.01..65981.61 rows=200 width=4) (actual time=0.026..0.064 rows=100 loops=1)
         ->  Merge Append  (cost=0.01..65981.61 rows=200 width=4) (actual time=0.026..0.053 rows=100 loops=1)
               Sort Key: public.foo.i
               ->  Index Only Scan using foo_i_idx on foo  (cost=0.00..20490.80 rows=100 width=4) (actual time=0.017..0.021 rows=51 loops=1)
                     Heap Fetches: 0
               ->  Index Only Scan using foo_i_idx on foo  (cost=0.00..20490.80 rows=100 width=4) (actual time=0.007..0.012 rows=50 loops=1)
                     Heap Fetches: 0
 Total runtime: 0.106 ms


But once I add even a basic WHERE clause, suddenly it decides that
sorting is the only way:
EXPLAIN ANALYZE
  select i from foo where i is not null
UNION ALL
  select i from foo where i is not null
ORDER BY 1 LIMIT 100;

 Limit  (cost=127250.56..127250.81 rows=100 width=4) (actual time=1070.799..1070.812 rows=100 loops=1)
   ->  Sort  (cost=127250.56..132250.56 rows=200 width=4) (actual time=1070.798..1070.804 rows=100 loops=1)
         Sort Key: public.foo.i
         Sort Method: top-N heapsort  Memory: 29kB
         ->  Result  (cost=0.00..50812.00 rows=200 width=4) (actual time=0.009..786.806 rows=200 loops=1)
               ->  Append  (cost=0.00..50812.00 rows=200 width=4) (actual time=0.007..512.201 rows=200 loops=1)
                     ->  Seq Scan on foo  (cost=0.00..15406.00 rows=100 width=4) (actual time=0.007..144.872 rows=100 loops=1)
                           Filter: (i IS NOT NULL)
                     ->  Seq Scan on foo  (cost=0.00..15406.00 rows=100 width=4) (actual time=0.003..139.196 rows=100 loops=1)
                           Filter: (i IS NOT NULL)
 Total runtime: 1070.847 ms


Works again when I stuff it in a subquery and put WHERE above the
UNION. But this loses flexibility -- I can't filter the subqueries
with different clauses any more:

EXPLAIN ANALYZE
select * from (
select i from foo
  UNION ALL
select i from foo
) subq where i is not null
ORDER BY 1 LIMIT 100;

 Limit  (cost=0.01..3.56 rows=100 width=4) (actual time=0.033..0.088 rows=100 loops=1)
   ->  Result  (cost=0.01..70981.61 rows=200 width=4) (actual time=0.032..0.071 rows=100 loops=1)
         ->  Merge Append  (cost=0.01..70981.61 rows=200 width=4) (actual time=0.031..0.059 rows=100 loops=1)
               Sort Key: public.foo.i
               ->  Index Only Scan using foo_i_idx on foo  (cost=0.00..22990.80 rows=100 width=4) (actual time=0.020..0.025 rows=51 loops=1)
                     Index Cond: (i IS NOT NULL)
                     Heap Fetches: 0
               ->  Index Only Scan using foo_i_idx on foo  (cost=0.00..22990.80 rows=100 width=4) (actual time=0.010..0.014 rows=50 loops=1)
                     Index Cond: (i IS NOT NULL)
                     Heap Fetches: 0
 Total runtime: 0.115 ms


Is this just a planner shortcoming or a bug? Or is there some
justification for this behavior?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: allow to get a number of processed rows by COPY statement

2012-08-16 Thread Pavel Stehule
Hello

here is updated patch

postgres=# copy omega to '/tmp/xxx';
COPY 60
postgres=# do $$ declare r int;
begin
   copy omega from '/tmp/xxx'; get diagnostics r = row_count; raise
notice ' %', r;
end;
$$ language plpgsql;
NOTICE:   60
DO

Regards

Pavel

2012/8/16 Bruce Momjian br...@momjian.us:

 What ever happened to this patch?  I don't see it on any of the
 commit-fests, though someone was asked for it to be added:

 http://archives.postgresql.org/pgsql-hackers/2011-10/msg00381.php

 ---

 On Tue, Oct  4, 2011 at 12:22:19PM +0200, Pavel Stehule wrote:
 Hello

 It is not possible to get the number of processed rows when COPY is
 evaluated via SPI. A client can use the command tag, but SPI doesn't expose the tag.

 I propose a small change to ProcessUtility so it can return the number of processed rows.

 Note: I found a small inconsistency between the SPI and utility interfaces.
 SPI still uses a 4-byte unsigned int for storing the number of processed
 rows, while the utility interface uses an 8-byte unsigned int.

 Motivation:

  postgres=# \sf fx
 CREATE OR REPLACE FUNCTION public.fx(tablename text, filename text)
  RETURNS integer
  LANGUAGE plpgsql
 AS $function$
 declare r int;
 begin
   execute format('COPY %s FROM %s', quote_ident(tablename),
 quote_literal(filename));
   get diagnostics r = row_count;
   return r;
 end;
 $function$

 Regards

 Pavel Stehule

 diff --git a/src/backend/executor/functions.c 
 b/src/backend/executor/functions.c
 index 398bc40..a7c2b8f 100644
 --- a/src/backend/executor/functions.c
 +++ b/src/backend/executor/functions.c
 @@ -600,6 +600,7 @@ postquel_getnext(execution_state *es, 
 SQLFunctionCachePtr fcache)
 es->qd->params,
 false,   /* not top level */
 es->qd->dest,
 +NULL,
  NULL);
   result = true;  /* never stops early */
   }
 diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
 index 688279c..21cabcc 100644
 --- a/src/backend/executor/spi.c
 +++ b/src/backend/executor/spi.c
 @@ -1838,6 +1838,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo 
 paramLI,
   {
   Node   *stmt = (Node *) lfirst(lc2);
 bool		canSetTag;
 + bool		isCopyStmt = false;
   DestReceiver *dest;

 _SPI_current->processed = 0;
 @@ -1857,6 +1858,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo 
 paramLI,
   {
   CopyStmt   *cstmt = (CopyStmt *) stmt;

 + isCopyStmt = true;
   if (cstmt->filename == NULL)
   {
   my_res = SPI_ERROR_COPY;
 @@ -1911,16 +1913,23 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo 
 paramLI,
   }
   else
   {
 + uint32 processed;
 +
   ProcessUtility(stmt,
  
  plansource->query_string,
  paramLI,
  false,   /* not 
 top level */
  dest,
 -NULL);
 +NULL,
 +&processed);
   /* Update processed if stmt returned tuples 
 */
 +
   if (_SPI_current->tuptable)
   	_SPI_current->processed = _SPI_current->tuptable->alloced -
   		_SPI_current->tuptable->free;
 + else if (canSetTag && isCopyStmt)
 + 	_SPI_current->processed = processed;
 +
   res = SPI_OK_UTILITY;
   }

 diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
 index 466727b..1a861ee 100644
 --- a/src/backend/tcop/pquery.c
 +++ b/src/backend/tcop/pquery.c
 @@ -1184,7 +1184,8 @@ PortalRunUtility(Portal portal, Node *utilityStmt, 
 bool isTopLevel,
   portal->portalParams,
  isTopLevel,
  dest,
 -completionTag);
 +completionTag,
 +NULL);

   /* Some utility statements may change context on us */
   MemoryContextSwitchTo(PortalGetHeapMemory(portal));
 diff 

Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-08-16 Thread Heikki Linnakangas

On 09.08.2012 18:42, Alexander Korotkov wrote:

In this revision of patch I tried to handle conditions more generally using
variables minLower, maxLower, minUpper, maxUpper, inclusive and
strictEmpty. However some strategies still contain additional logic.


Thanks, that clarified the code tremendously. The comments I added about 
the geometrical interpretations of the operations earlier seem 
unnecessary now, so removed those.



What is our conclusion about saving previous choice for RANGESTRAT_ADJACENT
strategy?


I think we're going to do what you did in the patch. A more generic 
mechanism for holding private state across consistent calls would be 
nice, but it's not that ugly the way you wrote it.


I committed the patch now, but left out the support for adjacent for 
now. Not because there was necessarily anything wrong with that, but 
because I have limited time for reviewing, and the rest of the patch 
looks ready for commit now. I reworded the comments quite a lot, you 
might want to proofread those to double-check that they're still 
correct. I'll take a look at the adjacent-support next, as a separate patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-16 Thread Heikki Linnakangas

On 15.08.2012 11:34, Alexander Korotkov wrote:

On Wed, Aug 15, 2012 at 12:14 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:


Histogram of upper bounds would be both more

accurate and natural for some operators. However, it requires collecting
additional statistics while AFAICS it doesn't liberate us from having
histogram of range lengths.


Hmm, if we collected a histogram of lower bounds and a histogram of upper
bounds, that would be roughly the same amount of data as for the standard
histogram with both bounds in the same histogram.


Ok, we've to decide if we need standard histogram. In some cases it can
be used for more accurate estimation of  and  operators.
But I think it is not so important. So, we can replace standard histogram
with histograms of lower and upper bounds?


Yeah, I think that makes more sense. The lower bound histogram is still 
useful for  and  operators, just not as accurate if there are lots of 
values with the same lower bound but different upper bound.
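
As a rough sketch (not spelled out in the thread) of how the two bound
histograms can drive an estimate: for a predicate col && 'const', two
non-empty ranges overlap exactly when neither lies wholly to one side of
the other, so

    P(col && const) = 1 - P(lower(col) > upper(const)) - P(upper(col) < lower(const))

and the two probabilities on the right can be read directly off the
histogram of lower bounds and the histogram of upper bounds, respectively.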


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2012-08-16 Thread Alvaro Herrera
Excerpts from Kevin Grittner's message of jue ago 02 10:48:02 -0400 2012:
 
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  
  Don't forget the peculiarities of columns with record types. 
  
 I forgot to include the type creation in the example:
  
 test=# create type a as (a1 int, a2 int);
 CREATE TYPE

Thanks for the example.  After playing with this, I think that a NOT
NULL constraint attached to a column with a composite type is equivalent
to a CHECK (col IS DISTINCT FROM NULL); at least they seem to behave
identically.  Is that what you would expect?

This seems a bit complicated to handle with the way I'm doing things
today; at parse analysis time, when my current code is creating the
check constraint, we don't know anything about the type of the column
IIRC.  Maybe I will have to delay creating the constraint until
execution.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UNION ALL with WHERE clause does not use Merge Append

2012-08-16 Thread Tom Lane
Marti Raudsepp ma...@juffo.org writes:
 Is this just a planner shortcoming or a bug? Or is there some
 justification for this behavior?

Per the comment in is_safe_append_member():

 * It's only safe to pull up the child if its jointree contains exactly
 * one RTE, else the AppendRelInfo data structure breaks. The one base RTE
 * could be buried in several levels of FromExpr, however.
 *
 * Also, the child can't have any WHERE quals because there's no place to
 * put them in an appendrel.  (This is a bit annoying...)

This will probably get fixed someday, but I wouldn't recommend holding
your breath for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Clean up the #include mess a little.

2012-08-16 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012:
 
 On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
  On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
  alvhe...@commandprompt.com wrote:
   I wonder what happens if files in the same subdir are grouped in a
   subgraph.  Is that possible?
  
   Possible, and done. Also added the possibility to add .c files to the graph,
   coloring by subdir, and the possibility to exclude nodes from the graph. I didn't 
  yet
  bother to clean up the code - to avoid eye damage, don't look at the source.
  
  Bad news is that it doesn't significantly help readability for the all nodes
  case. See all_with_subgraphs.svgz.  It does help for other cases.
  For example parsenodes.h.svgz has the result for
  render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
  and execnodes.h.svgz for
  --subgraphs --select='nodes/execnodes.h+*-*'
 
 Should we add this script and instructions to src/tools/pginclude?

Probably not, but maybe the developer FAQ in the wiki?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Underspecified window queries in regression tests

2012-08-16 Thread Bruce Momjian

I have used your notes below to rewrite the Window function SQL manual
section.  As you said, it was very hard to read.  I now understand it
better, having restructured it, and I hope others do too.

After waiting 30 minutes for our developer doc build to refresh, I am
giving up and posting my own URL for the doc changes:

http://momjian.us/tmp/pgsql/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS

Perhaps I need to go back to having my own doc build.

---

On Mon, Oct 17, 2011 at 11:48:38AM +0200, Florian Pflug wrote:
 On Oct17, 2011, at 01:09 , Tom Lane wrote:
  Florian Pflug f...@phlo.org writes:
  ... reading those parts again, I realize that it says "When ORDER BY is
  omitted the *default* frame consists ...", and that the second quote is followed
  by a footnote which says
  
   "There are options to define the window frame in other ways, but this tutorial
   does not cover them. See Section 4.2.8 for details." [3.5, Window Functions]
  
  So it was just me being thick. Sorry for the noise.
  
  Hmm.  Maybe the use of a footnote there is too subtle, and we should
  instead have that text in-line (probably in parentheses)?  Or we could
  use a note, but that's probably too much emphasis.
 
 Inline and in parentheses sounds fine.
 
 In addition, I think we should reword the explanation in 4.2.8 (The SQL 
 Language
 / SQL Syntax / Value Expressions / Window Functions). Instead of that rather
 long (and IMHO hard to read) paragraph about possible frame clauses and their
 behaviour in the presence or absence of an ORDER BY clause, we should go with
 a more algorithmic explanation I think.
 
 Something along these lines maybe:
 
 --
 .) PARTITION BY splits the rows into disjoint partitions. All further 
 processing
happens only inside a single partition
 
 .) In RANGE mode, ORDER BY then splits each partition into an ordered list of
sub-partitions, each containing rows which the ORDER BY considers to be
equivalent.
 
 .) In ROWS mode, OTOH, each sub-partition contains only a single row. Thus, if
there are rows which are considered to be equivalent by the ORDER BY, the
ordering of the sub-partition isn't fully determined.
 
 .) Each row's frame then consists of some consecutive range of sub-partitions.
 
 .) In RANGE mode, that consecutive range can only start at either the first
    sub-partition or the current row's sub-partition, and can only end at either
    the current row's sub-partition or the last sub-partition.
 
 .) In ROWS mode, the consecutive range may additionally start n sub-partitions
    (or rows, it's the same thing here) before the current row, and may additionally
    end m sub-partitions/rows after the current row.
 
 From that, it follows that even with an underspecified sort order, the 
 contents of
 each frame are still fully determined in RANGE mode. The ordering of rows 
 within
 a frame is not determined, though. So overall, in RANGE mode, a query's 
 result is
 only non-deterministic if the window function is sensitive to the ordering of 
 rows
 within a frame.
 
 In ROWS mode, OTOH, the contents of each frame themselves are not fully determined,
 so even an ordering-agnostic window function may produce non-deterministic results.
 --
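
A minimal SQL illustration of the RANGE-vs-ROWS point above (this example is
not from the original mail): with a tie on the ORDER BY column, the default
RANGE frame takes in all peers of the current row, while a ROWS frame cuts
between the tied rows in whatever order they happen to arrive:

    SELECT i, grp,
           sum(i) OVER (ORDER BY grp) AS range_sum,  -- default frame: RANGE ... CURRENT ROW
           sum(i) OVER (ORDER BY grp
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum
    FROM (VALUES (1, 1), (2, 1), (3, 2)) AS t(i, grp);

Both rows with grp = 1 get range_sum = 3 (they are peers of each other), but
their rows_sum values depend on which of the two tied rows the executor
happens to produce first, so only the ROWS result is non-deterministic here.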
 
 If you think that something along these lines would be an improvement, I can 
 try
 to come up with a patch.
 
 best regards,
 Florian Pflug
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Bruce Momjian
On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
 On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Alvaro Herrera alvhe...@alvh.no-ip.org writes:
  I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is comparing 
  Xmax as a TransactionId without verifying whether it is a multixact or 
  not.  Since they advance separately, this could lead to bogus answers.  
  This probably needs to be fixed.  I didn't look into past releases to see 
  if there's a live released bug here or not.
 
  I think the fix is simply to ignore the Xmax if the HEAP_XMAX_IS_MULTI bit 
  is set.
 
  Additionally I think it should check HEAP_XMAX_INVALID before reading the 
  Xmax at all.
 
  If it's failing to even check XMAX_INVALID, surely it's completely
  broken?  Perhaps it assumes its caller has checked all this?
 
 HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
 HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
 when HEAP_XMAX_IS_MULTI is not set.
 
 I'll add an assert to check this and a comment to explain.

Was this completed?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: In docs, change a few cases of not important to unimportant.

2012-08-16 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 16.08.2012 17:49, Bruce Momjian wrote:
 Heikki Linnakangas wrote:
 On 16.08.2012 17:36, Bruce Momjian wrote:
  In docs, change a few cases of "not important" to
  "unimportant".
 
  FWIW, I don't think these changes were an improvement. I find
  "not important" more readable in those sentences.
 
+1
 
 Really?  All of them?
 
  Yes.. Perhaps "not important" just feels better to me in general
  than "unimportant".
  
 Yeah, it's probably subjective, but for me "unimportant" has
 negative, dismissive overtones, while "not important" does not. 
 Sort of like the way "unpleasant" has a different meaning than
 "not pleasant".
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 11:38:14AM -0400, Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of jue ago 16 11:24:55 -0400 2012:
  On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
   On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is 
comparing Xmax as a TransactionId without verifying whether it is a 
multixact or not.  Since they advance separately, this could lead to 
bogus answers.  This probably needs to be fixed.  I didn't look into 
past releases to see if there's a live released bug here or not.
   
I think the fix is simply to ignore the Xmax if the HEAP_XMAX_IS_MULTI 
bit is set.
   
Additionally I think it should check HEAP_XMAX_INVALID before reading 
the Xmax at all.
   
If it's failing to even check XMAX_INVALID, surely it's completely
broken?  Perhaps it assumes its caller has checked all this?
   
   HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
   HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
   when HEAP_XMAX_IS_MULTI is not set.
   
   I'll add an assert to check this and a comment to explain.
  
  Was this completed?
 
 As far as I recall, there are changes related to this in my fklocks
 patch.  I am hoping to have some review happen on it during the upcoming
 commitfest (which presumably means I need to do a merge to newer
 sources.)

I was asking about adding the assert check --- does that need to wait
too?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] The pgrminclude problem

2012-08-16 Thread Peter Geoghegan
I found a tool that Google authored that attempts to solve the same
problems as pgrminclude does in our codebase.  It's called include
what you use, and is based on Clang. The project is hosted here:

http://code.google.com/p/include-what-you-use/

I'm not suggesting that we should start using this tool instead of
pgrminclude, because it has enough caveats of its own, and is mostly
written with Google's C++ codebase in mind. However, it's worth being
aware of. There is a useful analysis of the general problem in the
README - Why IWYU is Difficult (you can probably just skip the
extensive analysis of C++ templates as they relate to the problem
there).

The tool is authored by Craig Silverstein, Google's director of
technology. If he believes that IWYU is a difficult problem, well, it
probably is.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The pgrminclude problem

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 04:48:34PM +0100, Peter Geoghegan wrote:
 I found a tool that Google authored that attempts to solve the same
 problems as pgrminclude does in our codebase.  It's called include
 what you use, and is based on Clang. The project is hosted here:
 
 http://code.google.com/p/include-what-you-use/
 
 I'm not suggesting that we should start using this tool instead of
 pgrminclude, because it has enough caveats of its own, and is mostly
 written with Google's C++ codebase in mind. However, it's worth being
 aware of. There is a useful analysis of the general problem in the
 README - Why IWYU is Difficult (you can probably just skip the
 extensive analysis of C++ templates as they relate to the problem
 there).
 
 The tool is authored by Craig Silverstein, Google's director of
 technology. If he believes that IWYU is a difficult problem, well, it
 probably is.

Good to know. We only use pgrminclude every five years or so, and Tom
isn't even keen on that.

I added the URL to our src/tools/pginclude/README.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of jue ago 16 11:44:48 -0400 2012:
 On Thu, Aug 16, 2012 at 11:38:14AM -0400, Alvaro Herrera wrote:
  Excerpts from Bruce Momjian's message of jue ago 16 11:24:55 -0400 2012:
   On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@alvh.no-ip.org writes:
 I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is 
 comparing Xmax as a TransactionId without verifying whether it is a 
 multixact or not.  Since they advance separately, this could lead to 
 bogus answers.  This probably needs to be fixed.  I didn't look into 
 past releases to see if there's a live released bug here or not.

 I think the fix is simply to ignore the Xmax if the 
 HEAP_XMAX_IS_MULTI bit is set.

 Additionally I think it should check HEAP_XMAX_INVALID before 
 reading the Xmax at all.

 If it's failing to even check XMAX_INVALID, surely it's completely
 broken?  Perhaps it assumes its caller has checked all this?

HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
when HEAP_XMAX_IS_MULTI is not set.

I'll add an assert to check this and a comment to explain.
   
   Was this completed?
  
  As far as I recall, there are changes related to this in my fklocks
  patch.  I am hoping to have some review happen on it during the upcoming
  commitfest (which presumably means I need to do a merge to newer
  sources.)
 
 I was asking about adding the assert check --- does that need to wait
 too?

I don't think it's worth fussing about.  Also I don't need any more
merge conflicts than I already have.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2012-08-16 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
 I think that a NOT NULL constraint attached to a column with a
 composite type is equivalent to a CHECK (col IS DISTINCT FROM
 NULL); at least they seem to behave identically.  Is that what you
 would expect?
 
I had not thought about that, but now that you point it out I think
that interpretation makes more sense than any other.  In a quick
test they behaved identically for me.
 
 This seems a bit complicated to handle with the way I'm doing
 things today; at parse analysis time, when my current code is
 creating the check constraint, we don't know anything about the
 type of the column IIRC.  Maybe I will have to delay creating the
 constraint until execution.
 
Why?  CHECK (col IS DISTINCT FROM NULL) works correctly for *any*
type, doesn't it?
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: In docs, change a few cases of not important to unimportant.

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 10:38:09AM -0500, Kevin Grittner wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
  On 16.08.2012 17:49, Bruce Momjian wrote:
  Heikki Linnakangas wrote:
  On 16.08.2012 17:36, Bruce Momjian wrote:
  In docs, change a few cases of not important to
  unimportant.
 
  FWIW, I don't think these changes were an improvement. I find
  not important more readable in those sentences.
  
 +1
  
  Really?  All of them?
  
  Yes.. Perhaps not important just feels better to me in general
  than unimportant.
  
 Yeah, it's probably subjective, but for me unimportant has
 negative, dismissive overtones, while not important does not. 
 Sort of like the way unpleasant has a different meaning than not
 pleasant.

OK, thanks for the feedback.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The pgrminclude problem

2012-08-16 Thread Peter Geoghegan
On 16 August 2012 16:56, Bruce Momjian br...@momjian.us wrote:
 Good to know. We only use pgrminclude very five years or so, and Tom
 isn't even keen on that.

Yeah. Even if this could be made to work well, we'd still have to do
something like get an absolute consensus from all build farm animals,
if we expected to have an absolutely trustworthy list. I don't think
pgrminclude is a bad idea. I just think that it should only be used to
guide the efforts of a human to remove superfluous #includes, which is
how it is used anyway.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of jue ago 16 11:24:55 -0400 2012:
 On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
  On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   Alvaro Herrera alvhe...@alvh.no-ip.org writes:
   I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is comparing 
   Xmax as a TransactionId without verifying whether it is a multixact or 
   not.  Since they advance separately, this could lead to bogus answers.  
   This probably needs to be fixed.  I didn't look into past releases to 
   see if there's a live released bug here or not.
  
   I think the fix is simply to ignore the Xmax if the HEAP_XMAX_IS_MULTI 
   bit is set.
  
   Additionally I think it should check HEAP_XMAX_INVALID before reading 
   the Xmax at all.
  
   If it's failing to even check XMAX_INVALID, surely it's completely
   broken?  Perhaps it assumes its caller has checked all this?
  
  HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
  HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
  when HEAP_XMAX_IS_MULTI is not set.
  
  I'll add an assert to check this and a comment to explain.
 
 Was this completed?

As far as I recall, there are changes related to this in my fklocks
patch.  I am hoping to have some review happen on it during the upcoming
commitfest (which presumably means I need to do a merge to newer
sources.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Fabrízio de Royes Mello
2012/8/15 David E. Wheeler da...@justatheory.com

 On Aug 15, 2012, at 11:31 AM, Fabrízio de Royes Mello wrote:

  Is there any reason not to add $subject? Would it be difficult?
 
  Looking at the source code I think this feature isn't hard to
 implement... I'm writing a little patch to do that and I'll send it soon...

 Cool, thanks!


The attached patch implement this feature:

CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ] [
schema_element [ ... ] ]
CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element [
... ] ]

Now PostgreSQL doesn't throw an error if we use IF NOT EXISTS in a CREATE
SCHEMA statement.
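
For illustration, assuming the patch follows the NOTICE-and-skip convention
of the other IF NOT EXISTS commands (the schema name below is made up):

    CREATE SCHEMA IF NOT EXISTS foo;   -- creates the schema
    CREATE SCHEMA IF NOT EXISTS foo;   -- second time: skipped with a NOTICE, no error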

So, I don't know the next steps...

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


create_schema_if_not_exists.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Andrew Dunstan


On 08/16/2012 01:36 PM, Fabrízio de Royes Mello wrote:



2012/8/15 David E. Wheeler da...@justatheory.com 


On Aug 15, 2012, at 11:31 AM, Fabrízio de Royes Mello wrote:

 Is there any reason not to add $subject? Would it be difficult?

 Looking at the source code I think this feature isn't hard to
implement... I'm writing a little patch to do that and I'll send it
soon...

Cool, thanks!


The attached patch implement this feature:

CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name 
] [ schema_element [ ... ] ]
CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ 
schema_element [ ... ] ]


Now PostgreSQL doesn't throw an error if we use IF NOT EXISTS in a 
CREATE SCHEMA statement.


So, I don't know the next steps...



Please see 
http://wiki.postgresql.org/wiki/Developer_FAQ#I_have_developed_a_patch.2C_what_next.3F


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread David E. Wheeler
On Aug 16, 2012, at 10:36 AM, Fabrízio de Royes Mello wrote:

 The attached patch implement this feature:
 
 CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ] [ 
 schema_element [ ... ] ]
 CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element [ 
 ... ] ]
 
 Now PostgreSQL doesn't throw an error if we use IF NOT EXISTS in a CREATE 
 SCHEMA statement.
 
 So, I don't know the next steps...

Awesome, thanks! Please add it to the next CommitFest:

  https://commitfest.postgresql.org/action/commitfest_view?id=15

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Fabrízio de Royes Mello
2012/8/16 David E. Wheeler da...@justatheory.com

 On Aug 16, 2012, at 10:36 AM, Fabrízio de Royes Mello wrote:

  The attached patch implement this feature:
 
  CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ]
 [ schema_element [ ... ] ]
  CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element
 [ ... ] ]
 
  Now PostgreSQL doesn't throw an error if we use IF NOT EXISTS in a CREATE
 SCHEMA statement.
 
  So, I don't know the next steps...

 Awesome, thanks! Please add it to the next CommitFest:

   https://commitfest.postgresql.org/action/commitfest_view?id=15


Patch added to CommitFest:

https://commitfest.postgresql.org/action/patch_view?id=907

Thanks,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] psql \set vs \copy - bug or expected behaviour?

2012-08-16 Thread Bruce Momjian
On Fri, Oct 21, 2011 at 05:31:41PM -0400, Robert Haas wrote:
 On Fri, Oct 21, 2011 at 7:24 AM, Richard Huxton d...@archonet.com wrote:
  It looks like \copy is just passing the text of the query unadjusted to
  COPY. I get a syntax error on :x with the \copy below on both 9.0 and
  9.1
 
  === test script ===
  \set x '''HELLO'''
  -- Works
  \echo :x
  -- Works
  \o '/tmp/test1.txt'
  COPY (SELECT :x) TO STDOUT;
  -- Doesn't work
  \copy (SELECT :x) TO '/tmp/test2.txt'
  === end script ===
 
 I'm not sure whether that's a bug per se, but I can see where a
 behavior change might be an improvement.

I did some research on this and learned a little more about flex rules.

Turns out we can allow variable substitution in psql whole-line
commands, like \copy and \!, by sharing the variable expansion flex
rules with the code that does argument processing.  

What we can't easily do is allow quotes to prevent variable
substitution in these whole-line commands, because processing the
quotes would remove them.

Here are some examples;  \copy and \! behave the same:

test=> \set x abc
test=> \echo :x
abc
test=> \echo ':x'
:x
test=> \! echo :x
abc
test=> \! echo ':x'
abc

Notice the last line has expanded :x even though it is in quotes.

So, what do we want?  The attached patch is pretty short.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
new file mode 100644
index 1208c8f..3732dc5
*** a/src/bin/psql/psqlscan.l
--- b/src/bin/psql/psqlscan.l
*** other			.
*** 934,949 
  
  }
  
! <xslasharg>{
  	/*
  	 * Default processing of text in a slash command's argument.
  	 *
  	 * Note: unquoted_option_chars counts the number of characters at the
  	 * end of the argument that were not subject to any form of quoting.
  	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
  	 */
  
! {space}|\\	{
  	/*
  	 * Unquoted space is end of arg; do not eat.  Likewise
  	 * backslash is end of command or next command, do not eat
--- 934,956 
  
  }
  
! 
  	/*
  	 * Default processing of text in a slash command's argument.
+ 	 * It shares token actions with xslasharg and xslashwholeline.
  	 *
  	 * Note: unquoted_option_chars counts the number of characters at the
  	 * end of the argument that were not subject to any form of quoting.
  	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
  	 */
  
! <xslashwholeline>{space}+	{
! 	/* process entire line, but suppress leading whitespace */
! 	if (output_buf->len > 0)
! 		ECHO;
! }
! 
! <xslasharg>{space}|\\	{
  	/*
  	 * Unquoted space is end of arg; do not eat.  Likewise
  	 * backslash is end of command or next command, do not eat
*** other			.
*** 957,982 
  	return LEXRES_OK;
  }
  
! {quote}			{
! 	*option_quote = '\'';
! 	unquoted_option_chars = 0;
! 	BEGIN(xslashquote);
! }
! 
! `{
  	backtick_start_offset = output_buf-len;
  	*option_quote = '`';
  	unquoted_option_chars = 0;
  	BEGIN(xslashbackquote);
  }
  
! {dquote}		{
  	ECHO;
  	*option_quote = '';
  	unquoted_option_chars = 0;
  	BEGIN(xslashdquote);
  }
  
  :{variable_char}+	{
  	/* Possible psql variable substitution */
  	if (option_type == OT_NO_EVAL)
--- 964,1005 
  	return LEXRES_OK;
  }
  
! <xslasharg>`		{
! /* Only in xslasharg, so backticks are potentially passed to the shell */
  	backtick_start_offset = output_buf->len;
  	*option_quote = '`';
  	unquoted_option_chars = 0;
  	BEGIN(xslashbackquote);
  }
  
! <xslasharg>{quote}			{
! 	*option_quote = '\'';
! 	unquoted_option_chars = 0;
! 	BEGIN(xslashquote);
! }
! 
! <xslasharg>{dquote}		{
  	ECHO;
  	*option_quote = '';
  	unquoted_option_chars = 0;
  	BEGIN(xslashdquote);
  }
  
+ <xslasharg>{other}	{
+ 	unquoted_option_chars++;
+ 	ECHO;
+ }
+ 
+ <xslashwholeline>{other}	{ ECHO; }
+ 
+ 	/*
+ 	 *	This code allows variable processing in slasharg and wholeline
+ 	 *	modes.  wholeline does not allow quoting to prevent variable
+ 	 *	substitution because quote detection would remove the quotes.
+ 	 */
+  
+ <xslasharg,xslashwholeline>{
+ 
  :{variable_char}+	{
  	/* Possible psql variable substitution */
  	if (option_type == OT_NO_EVAL)
*** other			.
*** 1044,1054 
  	ECHO;
  }
  
- {other}			{
- 	unquoted_option_chars++;
- 	ECHO;
- }
- 
  }
  
  <xslashquote>{
--- 1067,1072 
*** other			.
*** 1115,1133 
  
  }
  
- <xslashwholeline>{
- 	/* copy everything until end of input line */
- 	/* but suppress leading whitespace */
- 
- {space}+		{
- 	if (output_buf->len > 0)
- 		

Re: [HACKERS] TRUE/FALSE vs true/false

2012-08-16 Thread Bruce Momjian
On Tue, Aug 14, 2012 at 10:57:08PM -0400, Peter Eisentraut wrote:
 On Tue, 2012-08-14 at 17:36 -0400, Bruce Momjian wrote:
  On Tue, Aug 14, 2012 at 05:34:02PM -0400, Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
On Thu, Aug  4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
I meant a mass sed -e 's/TRUE/true/g'  -e 's/FALSE/false/g' run
so all the ~200 occurrences of both TRUE and FALSE get
converted so the whole source tree is consistent.
   
I would be in favor of that.
   
I have implemented this with the patch at:
http://momjian.us/expire/true_diff.txt
   
   Does this really do anything for us that will justify the extra
   back-patching pain it will cause?  I don't see that it's improving
   code readability any.
  
  I think it is more of a consistency issue.  There were multiple people
  who wanted this change.  Of course, some of those people don't backport
  stuff.
 
 I guess you could argue this particular case without end, but I think we
 should be open to these kinds of changes.  They will make future code
 easier to deal with and confuse new developers less (when to use which,
 do they mean different things, etc.).
 
 If back-patching really becomes a problem, we might want to look a
 little deeper into git options.  I think the Linux kernel people do
 these kinds of cleanups more often, so there is probably some better
 support for it.

So what do we want to do with this?  I am a little concerned that we are
sacrificing code clarity for backpatching ease, but I don't do as much
backpatching as Tom.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TRUE/FALSE vs true/false

2012-08-16 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 
 So what do we want to do with this?  I am a little concerned that
 we are sacrificing code clarity for backpatching ease, but I don't
 do as much backpatching as Tom.
 
Well, if you back-patched this change, it would eliminate the issue
for Tom, wouldn't it?  Not sure if that's sane; just a thought.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TRUE/FALSE vs true/false

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 02:21:12PM -0500, Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:
  
  So what do we want to do with this?  I am a little concerned that
  we are sacrificing code clarity for backpatching ease, but I don't
  do as much backpatching as Tom.
  
 Well, if you back-patched this change, it would eliminate the issue
 for Tom, wouldn't it?  Not sure if that's sane; just a thought.

I would be worried about some instability in backpatching.  I was
looking for an 'ignore-case' mode to patch, but I don't see it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner avoidance of index only scans for partial indexes

2012-08-16 Thread Josh Berkus
On 8/15/12 3:28 PM, Merlin Moncure wrote:
 Aside: the performance gains I'm seeing for IOS are nothing short of
 spectacular.

Do you have some metrics?  I could use them for publicity stuff.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] State of the on-disk bitmap index

2012-08-16 Thread Daniel Bausch
Hello Jonah, Simon, and the hackers,

I am going to implement a simple kind of encoded bitmap index (EBI).
That is an index type where the set of bits belonging to a tuple is no
longer restricted to containing a single '1'.  Instead, an additional
mapping table translates the distinct values of the table column into a
unique encoding.  To select a given value, all bitmap columns must be
compared instead of only one.  Thanks to Boolean logic, queries that
match multiple different values (like IN lists or range queries) need to
compare less than the full set of bitmaps.  The total number of bitmaps
required to represent unique encodings for all different values is
ceil(ld(n)), i.e. ceil(log2(n)), where n is the number of distinct
values -- for example, 1000 distinct values need only 10 bitmaps.
Compared to normal bitmap indexes this solves the problem of
high-cardinality columns.  It is targeted at data warehousing scenarios
with insert-only data.
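
To illustrate the encoding (a made-up four-value example, not taken from the
paper): a plain bitmap index needs one bitmap per distinct value, while the
encoded form needs only ceil(log2(4)) = 2 bitmaps plus the mapping table:

    value   plain bitmaps (b_a b_b b_c b_d)   encoded bitmaps (e1 e0)
    'a'      1   0   0   0                     0  0
    'b'      0   1   0   0                     0  1
    'c'      0   0   1   0                     1  0
    'd'      0   0   0   1                     1  1

A lookup for 'c' then tests e1 = 1 AND e0 = 0, and a predicate such as
value IN ('c', 'd') collapses to e1 = 1 alone, which is the Boolean
simplification mentioned above.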

The respective scientific paper can be found at
http://www.dvs.tu-darmstadt.de/publications/pdf/ebi_a4.pdf

I thought it could be a good idea to base my work on the long-proposed
on-disk bitmap index implementation.  According to the wiki, you, Jonah
and Simon, were the last devs that touched this thing.  Unfortunately I
could not find the patch representing your state of that work.  I could
only capture the development history up to Gianni Ciolli & Gabriele
Bartolini from the old pgsql-patches archives.  Other people involved
were Jie Zhang, Gavin Sherry, Heikki Linnakangas, and Leonardo F.  Are
you and the others still interested in getting this into PG?  A rebase
of the most current bitmap index implementation onto master HEAD will be
the first 'byproduct' that I am going to deliver back to you.

1. Is anyone working on this currently?
2. Who has got the most current source code?
3. Is there a git of that or will I need to reconstruct the history from
the patches I collected?

Disclaimer: Although I read and manually processed most of the
executor's code during my last work, I would still call myself a newbie
and therefore will need some assistance most probably.

Regards,
Daniel

-- 
Daniel Bausch
Wissenschaftlicher Mitarbeiter
Technische Universität Darmstadt
Fachbereich Informatik
Fachgebiet Datenbanken und Verteilte Systeme

Hochschulstraße 10
64289 Darmstadt
Germany

Tel.: +49 6151 16 6706
Fax:  +49 6151 16 6229


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unreproducible bug in snapshot import code

2012-08-16 Thread Bruce Momjian
On Fri, Oct 28, 2011 at 11:07:25AM -0400, Gurjeet Singh wrote:
 On Fri, Oct 28, 2011 at 10:11 AM, Bruce Momjian br...@momjian.us wrote:
 
 Gurjeet Singh wrote:
I have tried reproducing the bug starting from 1 and 2 transactions
   before
the one shown in snippet, and I used tab-completion to get the same
screen-output as termonal1.txt and yet it's not reproducible.
  
   I could reproduce it when I typed TAB just after typing "set" in "set
   transaction snapshot".
   As Tom and Alvaro pointed out, the tab-completion issues a query,
   which prevents the "set transaction snapshot" command.
  
 
  Great! That settles it then. Reproducible, but not a bug.
 
 Yes, it is only tabs that query the database for completion that cause
 this.  Should this be documented somehow?  (No idea how.)
 
 
 If we have a doc section on psql's tab-completion, I think this needs to go
 there. A note like:
 
 Trying to tab-complete on psql may send queries to the server, and depending
 on the transaction state, execution of these queries may lead to non-default/
 unexpected behaviour by the queries executed after tab-completion. For 
 example,
 ...

I have added the attached patch to document this limitation.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index ebb0ad4..a47af51
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** testdb=&gt; \set PROMPT1 '%[%033[1;33;40
*** 3272,3278 ****
  exits and is reloaded when
  <application>psql</application> starts up. Tab-completion is also
  supported, although the completion logic makes no claim to be an
! <acronym>SQL</acronym> parser.  If for some reason you do not like the tab completion, you
  can turn it off by putting this in a file named
  <filename>.inputrc</filename> in your home directory:
  <programlisting>
--- 3272,3281 ----
  exits and is reloaded when
  <application>psql</application> starts up. Tab-completion is also
  supported, although the completion logic makes no claim to be an
! <acronym>SQL</acronym> parser.  The queries generated by tab-completion
! can also interfere with other SQL commands, e.g. <literal>SET
! TRANSACTION ISOLATION LEVEL</>.
! If for some reason you do not like the tab completion, you
  can turn it off by putting this in a file named
  <filename>.inputrc</filename> in your home directory:
  <programlisting>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Clean up the #include mess a little.

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 11:15:24AM -0400, Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012:
  
  On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
   On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
   alvhe...@commandprompt.com wrote:
I wonder what happens if files in the same subdir are grouped in a
subgraph.  Is that possible?
   
   Possible, and done. Also added the possibility to add .c files to the graph,
   coloring by subdir, and the possibility to exclude nodes from the graph. I didn't 
   yet
   bother to clean up the code - to avoid eye damage, don't look at the 
   source.
   
   Bad news is that it doesn't significantly help readability for the all 
   nodes
   case. See all_with_subgraphs.svgz.  It does help for other cases.
   For example parsenodes.h.svgz has the result for
   render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
   and execnodes.h.svgz for
   --subgraphs --select='nodes/execnodes.h+*-*'
  
  Should we add this script and instructions to src/tools/pginclude?
 
 Probably not, but maybe the developer FAQ in the wiki?

I just added the script email URL to the pginclude README.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_page_prune comments

2012-08-16 Thread Bruce Momjian
On Wed, Nov  2, 2011 at 12:27:02PM -0400, Robert Haas wrote:
 The following comment - or at least the last sentence thereof -
 appears to be out of date.
 
 /*
  * XXX Should we update the FSM information of this page ?
  *
  * There are two schools of thought here. We may not want to update 
 FSM
  * information so that the page is not used for unrelated
 UPDATEs/INSERTs
  * and any free space in this page will remain available for further
  * UPDATEs in *this* page, thus improving chances for doing HOT 
 updates.
  *
  * But for a large table and where a page does not receive
 further UPDATEs
  * for a long time, we might waste this space by not updating the FSM
  * information. The relation may get extended and fragmented further.
  *
  * One possibility is to leave fillfactor worth of space in this 
 page
  * and update FSM with the remaining space.
  *
  * In any case, the current FSM implementation doesn't accept
  * one-page-at-a-time updates, so this is all academic for now.
  */
 
 The simple fix here is just to delete that last sentence, but does
 anyone think we ought to change the behavior, now that we have the
 option to do so?

Last sentence removed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-08-16 Thread Peter Geoghegan
On 27 July 2012 16:39, Jeff Janes jeff.ja...@gmail.com wrote:
 Can you suggest a benchmark that will usefully exercise this patch?

 I think the given sizes below work on most 64 bit machines.

My apologies for not getting around to taking a look at this sooner.

I tested this patch on my x86_64 laptop:

[peter@peterlaptop tests]$ uname -r
3.4.4-4.fc16.x86_64

I have been able to recreate your results with the work_mem setting
you supplied, 16384 (both queries that you suggested are executed
together, for a less sympathetic workload):

[peter@peterlaptop tests]$ cat sortmemgrow_sort.sql
select count(distinct foo) from (select random() as foo from
generate_series(1,524200)) asdf;
select count(distinct foo) from (select random() as foo from
generate_series(1,524300)) asdf;
[peter@peterlaptop tests]$ # For master:
[peter@peterlaptop tests]$ pgbench -f sortmemgrow_sort.sql -T 45 -n
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 45 s
number of transactions actually processed: 45
tps = 0.992526 (including connections establishing)
tps = 0.992633 (excluding connections establishing)
[peter@peterlaptop tests]$ # For patch:
[peter@peterlaptop tests]$ pgbench -f sortmemgrow_sort.sql -T 45 -n
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 45 s
number of transactions actually processed: 66
tps = 1.461739 (including connections establishing)
tps = 1.461911 (excluding connections establishing)

I didn't find trace_sort all that useful for my earlier work on
optimising in-memory sorting (at least when running benchmarks), as
the granularity is too small for simple, relatively inexpensive
queries with sort nodes (or some tuplesort usage, like the datum sort
of your example). Also, the instrumentation can have a Heisenberg
effect. I've avoided using it here. The less expensive query's
executions costs are essentially the same with and without the patch.

This patch works by not doubling the size of state->memtupsize when
available memory is less than half of allowed memory (i.e. we are one
call to grow_memtuples() away from reporting LACKMEM). Rather, in that
situation, it is set to a size that extrapolates the likely size of
the total amount of memory needed in a way that is quite flawed, but
likely to work well for the pass-by-value Datum case. Now, on the face
of it, this may appear to be a re-hashing of the question of "how
paranoid do we need to be about wasting memory in memory-constrained
situations - should we consider anything other than a geometric growth
rate, to be parsimonious with memory at the risk of thrashing?".
However, this is not really the case, because this is only a single,
last-ditch effort to avoid a particularly undesirable eventuality. We
have little to lose and quite a bit to gain. A cheap guesstimation
seems reasonable.
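
To make that concrete, here is a minimal standalone sketch of the growth
decision being described (the names and numbers are invented for
illustration; this is not the actual tuplesort.c code):

#include <stdio.h>
#include <math.h>

/*
 * Sketch: grow geometrically while there is plenty of headroom, but once
 * more than half of the memory budget is used, extrapolate the final array
 * size from the memory consumed so far instead of giving up with half of
 * the budget unused.
 */
static int
grow_array_size(int old_size, double mem_now_used, double allowed_mem)
{
    if (mem_now_used <= allowed_mem / 2)
        return old_size * 2;                    /* ordinary doubling */

    /* last-ditch growth: round down so we stay within the budget */
    return (int) floor(old_size * (allowed_mem / mem_now_used));
}

int
main(void)
{
    double  allowed_mem = 16384.0 * 1024.0;     /* work_mem = 16MB, say */
    double  mem_used = 10.0 * 1024.0 * 1024.0;  /* already past the halfway mark */

    printf("new array size: %d\n",
           grow_array_size(131072, mem_used, allowed_mem));
    return 0;
}

Using ceil() instead of floor() here could overshoot the budget by a single
SortTuple, which is the failure mode discussed further down.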

I have attached a revision for your consideration, with a few
editorialisations, mostly style-related.

I removed this entirely:

+*  is the new method still follow this?  The last allocation is no
+* longer necessarily a power of 2, but that is not freed.

I did so because, according to aset.c:

 *  Further improvement 12/00: as the code stood, request sizes in the
 *  midrange between small and large were handled very inefficiently,
 *  because any sufficiently large free chunk would be used to satisfy a
 *  request, even if it was much larger than necessary.  This led to more
 *  and more wasted space in allocated chunks over time.  To fix, get rid
 *  of the midrange behavior: we now handle only small power-of-2-size
 *  chunks as chunks.  Anything large is passed off to malloc().  Change
 *  the number of freelists to change the small/large boundary.

So, at the top of grow_memtuples, this remark still holds:

 * and so there will be no unexpected addition to what we ask for.  (The
 * minimum array size established in tuplesort_begin_common is large
 * enough to force palloc to treat it as a separate chunk, so this
 * assumption should be good.  But let's check it.)

It continues to hold because we're still invariably passing off this
request to malloc() (or, in this particular case, realloc())
regardless of the alignment of the request size. Certainly, the extant
code verifies !LACKMEM, and the regression tests pass when this patch
is applied.

However, there is still a danger of LACKMEM. This revision has the
following logic for extrapolating newmemtupsize (This is almost the
same as the original patch):

+   newmemtupsize = (int) floor(oldmemtupsize * allowedMem / memNowUsed);

Suppose that the code was:

+   newmemtupsize = (int) ceil(oldmemtupsize * allowedMem / memNowUsed);

We'd now have an error with your latter example query because
!LACKMEM, due to the inclusion of a single additional SortTuple. This
is because GetMemoryChunkSpace (and, ultimately,

Re: [HACKERS] Avoiding shutdown checkpoint at failover

2012-08-16 Thread Bruce Momjian
On Thu, Mar  8, 2012 at 08:20:02AM -0500, Robert Haas wrote:
 On Sat, Jan 28, 2012 at 8:57 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On Thu, Jan 26, 2012 at 5:27 AM, Fujii Masao masao.fu...@gmail.com wrote:
 
  One thing I would like to ask is that why you think walreceiver is more
  appropriate for writing XLOG_END_OF_RECOVERY record than startup
  process. I was thinking the opposite, because if we do so, we might be
  able to skip the end-of-recovery checkpoint even in file-based log-shipping
  case.
 
  Right now, WALReceiver has one code path/use case.
 
  Startup has so many, its much harder to know whether we'll screw up one of 
  them.
 
  If we can add it in either place then I choose the simplest, most
  relevant place. If the code is the same, we can move it around later.
 
  Let me write the code and then we can think some more.
 
 Are we still considering trying to do this for 9.2?  Seems it's been
 over a month without a new patch, and it's not entirely clear that we
 know what the design should be.

Did this get completed?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regression tests fail once XID counter exceeds 2 billion

2012-08-16 Thread Bruce Momjian
On Wed, Nov 16, 2011 at 07:08:27PM -0500, Tom Lane wrote:
 I wrote:
  Simon Riggs si...@2ndquadrant.com writes:
  We need a function called transactionid_current() so a normal user can 
  write
 
  select virtualtransaction
  from pg_locks
  where transactionid = transactionid_current()
 
  and have it just work.
 
  That would solve that one specific use-case.  The reason I suggested
  txid_from_xid is that it could also be used to compare XIDs seen in
  tuples to members of a txid_snapshot, which is not possible now.
 
 BTW, a pgsql-general question just now made me realize that
 txid_from_xid() could have another use-case too.  Right now, there are
 no inequality comparisons on XIDs, which is necessary because XIDs in
 themselves don't have a total order.  However, you could
 
   ORDER BY txid_from_xid(xmin)
 
 and it would work, ie, give you rows in their XID order.  This could be
 useful for finding the latest-modified rows in a table, modulo the fact
 that it would be ordering by transaction start time not commit time.

Added to TODO:

Add function to allow easier transaction id comparisons

http://archives.postgresql.org/pgsql-hackers/2011-11/msg00786.php 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Dickson S. Guedes
2012/8/16 Fabrízio de Royes Mello fabriziome...@gmail.com:
 The attached patch implement this feature:

 CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ] [
 schema_element [ ... ] ]
 CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element [
 ... ] ]

 Now, PostgreSQL doesn't throw an error if we use IF NOT EXISTS in a CREATE
 SCHEMA statement.

I started testing this, but I didn't see regression tests for it.
Could you write them?

Best.
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Bruce Momjian

Did we make any headway on this?

---

On Sat, Nov 19, 2011 at 12:31:09PM -0500, Stephen Frost wrote:
 Andres,
 
 * Andres Freund (and...@anarazel.de) wrote:
  For that I added new functions/defines which allocate all the needed memory 
  in 
  one hunk:
  list_immutable_make$n(),
  List *list_new_immutable_n(NodeTag t, size_t n);
  List *list_make_n(NodeTag t, size_t n, ...);
 
 A while back, I posted a patch to try and address this same issue.  The
 approach that I took was to always pre-allocate a certain (#defined)
 amount (think it was 5 or 10 elements).  There were a number of places
 that caused problems with that approach because they hacked on the list
 element structures directly (instead of using the macros/functions)-
 you'll want to watch out for those areas in any work on lists.
 
 That patch is here:
 http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php
 
 The thread on it might also be informative.
 
 I do like your approach of being able to pass the ultimate size of the
 list in..  Perhaps the two approaches could be merged?  I was able to
 make everything work with my approach, provided all the callers used the
 list API (I did that by making sure the links, etc, actually pointed to
 the right places in the pre-allocated array).  One downside was that the
 size ended up being larger than it might have been in some cases.
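
As a rough illustration of the pre-allocation idea described above (invented
names; this is not the pg_list.h API), a list that carries a few inline cells
and only falls back to separate allocations once those are used up might look
like this:

#include <stdio.h>
#include <stdlib.h>

#define LIST_PREALLOC 8                 /* the "#defined amount" */

typedef struct ListCell
{
    void            *ptr;
    struct ListCell *next;
} ListCell;

typedef struct List
{
    int       length;
    ListCell *head;
    ListCell *tail;
    ListCell  prealloc[LIST_PREALLOC];  /* allocated together with the List */
} List;

static List *
list_new(void)
{
    return calloc(1, sizeof(List));
}

static void
list_append(List *list, void *datum)
{
    ListCell *cell;

    if (list->length < LIST_PREALLOC)
        cell = &list->prealloc[list->length];   /* reuse the inline storage */
    else
        cell = calloc(1, sizeof(ListCell));     /* fall back to the heap */

    cell->ptr = datum;
    cell->next = NULL;
    if (list->tail)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    list->length++;
}

int
main(void)
{
    List *l = list_new();
    int   x = 42;

    list_append(l, &x);
    list_append(l, &x);
    printf("length = %d\n", l->length);
    return 0;
}

Building a short list this way costs one allocation instead of one per cell,
at the price of a somewhat larger List node, which is the trade-off noted
above.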
 
   Thanks,
 
   Stephen



-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Bruce Momjian

Is this a TODO?

---

On Sat, Nov 19, 2011 at 12:31:09PM -0500, Stephen Frost wrote:
 Andres,
 
 * Andres Freund (and...@anarazel.de) wrote:
  For that I added new functions/defines which allocate all the needed memory 
  in 
  one hunk:
  list_immutable_make$n(),
  List *list_new_immutable_n(NodeTag t, size_t n);
  List *list_make_n(NodeTag t, size_t n, ...);
 
 A while back, I posted a patch to try and address this same issue.  The
 approach that I took was to always pre-allocate a certain (#defined)
 amount (think it was 5 or 10 elements).  There were a number of places
 that caused problems with that approach because they hacked on the list
 element structures directly (instead of using the macros/functions)-
 you'll want to watch out for those areas in any work on lists.
 
 That patch is here:
 http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php
 
 The thread on it might also be informative.
 
 I do like your approach of being able to pass the ultimate size of the
 list in..  Perhaps the two approaches could be merged?  I was able to
 make everything work with my approach, provided all the callers used the
 list API (I did that by making sure the links, etc, actually pointed to
 the right places in the pre-allocated array).  One downside was that the
 size ended up being larger than it might have been in some cases.
 
   Thanks,
 
   Stephen



-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not HOT enough

2012-08-16 Thread Bruce Momjian

Did we want to apply this?

---

On Wed, Nov 23, 2011 at 07:33:18PM +, Simon Riggs wrote:
 On Wed, Nov 23, 2011 at 6:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  The real question is do we favour HOT cleanup on those small 8 tables,
  or do we favour HOT cleanup of every other table?
 
  No, the real question is why not think a little harder and see if we can
  come up with a solution that doesn't involve making some cases worse to
  make others better.
 
 Slightly modified patch attached.
 
 When we access a shared relation and the page is prunable we re-derive
 the cutoff value using GetOldestXmin.
 
 That code path is rarely taken. In particular, pg_shdepend is only
 accessed during object creation/alter/drop, so this is a path where we
 can spare a little extra cost, just like the trade-off
 we've taken to make locking faster for DML while making DDL a little
 slower.
 
 If this is still unacceptable, then I'll have to look at reducing
 impact on pg_shdepend from temp tables - which is still a plan.
 
 -- 
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services

 diff --git a/src/backend/access/heap/pruneheap.c 
 b/src/backend/access/heap/pruneheap.c
 index 61f2ce4..5feaedc 100644
 --- a/src/backend/access/heap/pruneheap.c
 +++ b/src/backend/access/heap/pruneheap.c
 @@ -16,13 +16,13 @@
  
  #include access/heapam.h
  #include access/transam.h
 +#include catalog/catalog.h
  #include miscadmin.h
  #include pgstat.h
  #include storage/bufmgr.h
  #include utils/rel.h
  #include utils/tqual.h
  
 -
  /* Working data for heap_page_prune and subroutines */
  typedef struct
  {
 @@ -72,6 +72,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, 
 TransactionId OldestXmin)
  {
   Pagepage = BufferGetPage(buffer);
   Sizeminfree;
 + TransactionId   CutoffXmin = OldestXmin;
  
   /*
* Let's see if we really need pruning.
 @@ -91,6 +92,18 @@ heap_page_prune_opt(Relation relation, Buffer buffer, 
 TransactionId OldestXmin)
   return;
  
   /*
 +  * We choose to set RecentGlobalXmin only for the current database, 
 which
 +  * means we cannot use it to prune shared relations when reading them 
 with
 +  * MVCC snapshots. By making that choice it allows our snapshots to be
 +  * smaller and faster, and the RecentGlobalXmin will be further forward
 +  * and offer better pruning of non-shared relations. So when accessing a
 +  * shared relation and we see the page is prunable (above) we get an
 +  * OldestXmin across all databases.
 +  */
 + if (IsSharedRelation(relation->rd_id))
 + CutoffXmin = GetOldestXmin(true, true);
 +
 + /*
* We prune when a previous UPDATE failed to find enough space on the 
 page
* for a new tuple version, or when free space falls below the 
 relation's
* fill-factor target (but not less than 10%).
 @@ -124,7 +137,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, 
 TransactionId OldestXmin)
   
  * needed */
  
   /* OK to prune */
 - (void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
 + (void) heap_page_prune(relation, buffer, CutoffXmin, true, &ignore);
   }
  
   /* And release buffer lock */
 diff --git a/src/backend/storage/ipc/procarray.c 
 b/src/backend/storage/ipc/procarray.c
 index 1a48485..60153f4 100644
 --- a/src/backend/storage/ipc/procarray.c
 +++ b/src/backend/storage/ipc/procarray.c
 @@ -55,6 +55,7 @@
  #include storage/spin.h
  #include utils/builtins.h
  #include utils/snapmgr.h
 +#include utils/tqual.h
  
  
  /* Our shared memory area */
 @@ -1185,6 +1186,8 @@ GetMaxSnapshotSubxidCount(void)
   *   RecentGlobalXmin: the global xmin (oldest TransactionXmin 
 across all
   *   running transactions, except those running LAZY 
 VACUUM).  This is
   *   the same computation done by GetOldestXmin(true, true).
 + *   For MVCC snapshots we examine transactions running only 
 in our
 + *   database, ignoring longer running transactions in other 
 databases.
   *
   * Note: this function should probably not be called with an argument that's
   * not statically allocated (see xip allocation below).
 @@ -1200,9 +1203,12 @@ GetSnapshotData(Snapshot snapshot)
   int count = 0;
   int subcount = 0;
   boolsuboverflowed = false;
 + boolallDbs = false;
  
   Assert(snapshot != NULL);
  
 + allDbs = !IsMVCCSnapshot(snapshot);
 +
   /*
* Allocating space for maxProcs xids is usually overkill; numProcs 

Re: [HACKERS] Avoiding repeated snapshot computation

2012-08-16 Thread Bruce Momjian

Did we ever make a decision on this patch?

---

On Sat, Nov 26, 2011 at 09:22:50PM +0530, Pavan Deolasee wrote:
 On some recent benchmarks and profile data, I saw GetSnapshotData
 figures at the very top or near top. For lesser number of clients, it
 can account for 10-20% of time, but more number of clients I have seen
 it taking up as much as 40% of sample time. Unfortunately, the machine
 on which I was running these tests is currently not available and so I
 don't have the exact numbers. But the observation is almost correct.
 Our recent work on separating the hot members of PGPROC in a separate
 array would definitely reduce data cache misses and reduce the
 GetSnapshotData time, but it probably still accounts for a large
 enough critical section for a highly contended lock.
 
 I think now that we have reduced the run time of the function itself,
 we should now try to reduce the number of times the function is
 called. Robert proposed a way to reduce the number of calls per
 transaction. I think we can go one more step further and reduce the
 number across transactions.
 
 One major problem today could be because the way LWLock works. If the
 lock is currently held in SHARED mode by some backend and some other
 backend now requests it in SHARED mode, it will immediately get it.
 That's probably the right thing to do because you don't want the reader
 to really wait when the lock is readily available. But in the case of
 GetSnapshotData(), every reader is doing exactly the same thing; they
 are computing a snapshot based on the same shared state and would
 compute exactly the same snapshot (if we ignore the fact that we don't
 include caller's XID in xip array, but that's a minor detail). And
 because the way LWLock works, more and more readers would get in to
 compute the snapshot, until the exclusive waiters get a window to
 sneak in because more and more processes slowly start sleeping
 for exclusive access. To depict it, the four transactions make
 overlapping calls for GetSnapshotData() and hence the total critical
 section starts when the first caller enters it and ends when the
 last caller exits.
 
 Txn1 --[  SHARED]-
 Txn2 [  SHARED]---
 Txn3 -[SHARED]-
 Txn4 ---[   SHARED]-
   |---Total Time |
 
 Couple of ideas come to mind to solve this issue.
 
 A snapshot once computed will remain valid for every call irrespective
 of its origin until at least one transaction ends. So we can store the
 last computed snapshot in some shared area and reuse it for all
 subsequent GetSnapshotData calls. The shared snapshot will get
 invalidated when some transaction ends by calling
 ProcArrayEndTransaction(). I tried this approach and saw a 15%
 improvement for 32-80 clients on the 32 core HP IA box with pgbench -s
 100 -N tests. Not bad, but I think this can be improved further.
 
 What we can do is when a transaction comes to compute its snapshot, it
 checks if some other transaction is already computing a snapshot for
 itself. If so, it just sleeps on the lock. When the other process
 finishes computing the snapshot, it saves the snapshot is a shared
 area and wakes up all processes waiting for the snapshot. All those
 processes then just copy the snapshot from the shared area and they
 are done. This will not only reduce the total CPU consumption by
 avoiding repetitive work, but would also reduce the total time for
 which ProcArrayLock is held in SHARED mode by avoiding pipeline of
 GetSnapshotData calls. I am currently trying the shared work queue
 mechanism to implement this, but I am sure we can do it this in some
 other way too.
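
As a very rough sketch of the first idea (a cached shared snapshot that is
reused until some transaction ends), with all of the names invented for
illustration and the real locking reduced to comments:

#include <stdbool.h>
#include <string.h>

#define MAX_XIDS 64

typedef struct Snapshot
{
    unsigned int xmin;
    unsigned int xmax;
    int          nxip;
    unsigned int xip[MAX_XIDS];
} Snapshot;

typedef struct SharedSnapshotSlot
{
    bool     valid;                 /* cleared when a transaction ends */
    Snapshot snap;
} SharedSnapshotSlot;

static SharedSnapshotSlot shared_slot;  /* stand-in for shared memory */

/* the expensive path: scan the proc array (details elided) */
static void
compute_snapshot_from_procarray(Snapshot *snap)
{
    snap->xmin = 100;
    snap->xmax = 200;
    snap->nxip = 0;
}

/* what a snapshot-taking call would do under the proposed scheme */
static void
get_snapshot(Snapshot *out)
{
    /* LWLockAcquire(ProcArrayLock, LW_SHARED) would go here */
    if (!shared_slot.valid)
    {
        /* a real implementation must serialize writers here, e.g. via the
         * shared work queue idea mentioned above */
        compute_snapshot_from_procarray(&shared_slot.snap);
        shared_slot.valid = true;
    }
    memcpy(out, &shared_slot.snap, sizeof(Snapshot));   /* the cheap path */
    /* LWLockRelease(ProcArrayLock) */
}

/* what transaction end would add: invalidate the cached snapshot */
static void
end_transaction(void)
{
    shared_slot.valid = false;
}

int
main(void)
{
    Snapshot s;

    get_snapshot(&s);    /* computes and caches */
    get_snapshot(&s);    /* reuses the cached copy */
    end_transaction();   /* the next caller recomputes */
    get_snapshot(&s);
    return 0;
}

The second idea (have waiters sleep while one backend computes the snapshot
and then hand them the result) needs real synchronization on top of this
sketch, which is what the shared work queue mechanism would provide.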
 
 Thanks,
 Pavan
 
 -- 
 Pavan Deolasee
 EnterpriseDB     http://www.enterprisedb.com
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2012-08-16 Thread Bruce Momjian

A TODO for this?

---

On Tue, Dec  6, 2011 at 02:53:42PM -0500, Robert Haas wrote:
 On Tue, Dec 6, 2011 at 7:12 AM, Florian Weimer fwei...@bfk.de wrote:
  * Robert Haas:
 
  I tried whacking out the call to GetPageWithFreeSpace() in
  RelationGetBufferForTuple(), and also with the unpatched code, but the
  run-to-run randomness was way more than any difference the change
  made.  Is there a better test case?
 
  I think that if you want to exercise file system lookup performance, you
  need a larger directory, which presumably means a large number of
  tables.
 
 OK.  I created 100,000 dummy tables, 10,000 at a time to avoid blowing up
 the lock manager.  I then repeated my previous tests, and I still
 can't see any meaningful difference (on my MacBook Pro, running MacOS
 X v10.6.8).  So at least on this OS, it doesn't seem to matter much.
 I'm inclined to defer putting any more work into it until such time as
 someone can demonstrate that it actually causes a problem and provides
 a reproducible test case.  I don't deny that there's probably an
 effect and it would be nice to improve this, but it doesn't seem worth
 spending a lot of time on until we can find a case where the effect is
 measurable.
 
 On the other hand, the problem of the FSM taking up 24kB for an 8kB
 table seems clearly worth fixing, but I don't think I have the cycles
 for it at present.  Maybe a TODO is in order.
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2012-08-16 Thread Bruce Momjian

Was this resolved?  (Sorry to be bugging everyone.)

---

On Tue, Nov 29, 2011 at 11:42:10PM -0500, Robert Haas wrote:
 On Tue, Nov 29, 2011 at 11:50 AM, Bruce Momjian br...@momjian.us wrote:
  Just confirming we decided not to persue this.
 
 Doesn't sound like it.
 
 I've been thinking a lot about the more general problem here - namely,
 that allowing catalog changes without an access exclusive lock is
 unsafe - and trying to come up with a good solution, because I'd
 really like to see us make this work.  It strikes me that there are
 really two separate problems here.
 
 1. If you are scanning a system catalog using SnapshotNow, and a
 commit happens at just the right moment, you might see two versions of
 the row or zero rather than one.  You'll see two if you scan the old
 row version, then the concurrent transaction commits, and then you
 scan the new one.  You'll see zero if you scan the new row (ignoring
 it, since it isn't committed yet) and then the concurrent transaction
 commits, and then you scan the old row.  On a related note, if you
 scan several interrelated catalogs (e.g. when building a relcache
 entry) and a transaction that has made matching modifications to
 multiple catalogs commits midway through, you might see the old
 version of the data from one table and the new version of the data
 from some other table.  Using AccessExclusiveLock fixes the problem
 because we guarantee that anybody who wants to read those rows first
 takes some kind of lock, which they won't be able to do while the
 updater holds AccessExclusiveLock.
 
 2. Other backends may have data in the relcache or catcaches which
 won't get invalidated until they do AcceptInvalidationMessages().
 That will always happen no later than the next time they lock the
 relation, so if you are holding AccessExclusiveLock then you should be
 OK: no one else holds any lock, and they'll have to go get one before
 doing anything interesting.  But if you are holding any weaker lock,
 there's nothing to force AcceptInvalidationMessages to happen before
 you reuse those cache entries.
 
 In both cases, name lookups are an exception: we don't know what OID
 to lock until we've done the name lookup.
 
 Random ideas for solving the first problem:
 
 1-A. Take a fresh MVCC snapshot for each catalog scan.  I think we've
 rejected this before because of the cost involved, but maybe with
 Pavan's patch and some of the other improvements that are on the way
 we could make this cheap enough to be acceptable.
 1-B. Don't allow transactions that have modified system catalog X to
 commit while we're scanning system catalog X; make the scans take some
 kind of shared lock and commits an exclusive lock.  This seems awfully
 bad for concurrency, though, not to mention the overhead of taking and
 releasing all those locks even when nothing conflicts.
 1-C. Wrap a retry loop around every place where we scan a system
 catalog.  If a transaction that has written rows in catalog X commits
 while we're scanning catalog X, discard the results of the first scan
 and declare a do-over (or maybe something more coarse-grained, or at
 the other extreme even more fine-grained).  If we're doing several
 interrelated scans then the retry loop must surround the entire unit,
 and enforce a do-over of the whole operation if commits occur in any
 of the catalogs before the scan completes.
 
 Unfortunately, any of these seem likely to require a Lot of Work,
 probably more than can be done in time for 9.2.  However, perhaps it
 would be feasible to hone in on a limited subset of the cases (e.g.
 name lookups, which are mainly done in only a handful of places, and
 represent an existing bug) and implement the necessary code changes
 just for those cases.  Then we could generalize that pattern to other
 cases as time and energy permit.
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Stephen Frost
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
 Did we make any headway on this?

I've got the code written but I need to test it in a stable environment
to see what kind of improvement it provides.  I've actually been fighting
for the past 2 months with the box that I want to use and think I may
just give up and steal a different server.

It hasn't fallen off my list of things to do, just hasn't been a high
priority.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 09:51:28PM -0400, Stephen Frost wrote:
 Bruce,
 
 * Bruce Momjian (br...@momjian.us) wrote:
  Did we make any headway on this?
 
 I've got the code written but I need to test it in a stable environment
 to see what kind of improvement it provides.  I've actually been fighting
 for the past 2 months with the box that I want to use and think I may
 just give up and steal a different server.
 
 It hasn't fallen off my list of things to do, just hasn't been a high
 priority.

OK, thanks for the status update.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timing overhead and Linux clock sources

2012-08-16 Thread Bruce Momjian

FYI, I am planning to go ahead and package this tool in /contrib for PG
9.3.

---

On Fri, Dec  9, 2011 at 08:26:12PM -0500, Greg Smith wrote:
 On 12/09/2011 06:48 PM, Ants Aasma wrote:
 The attached test program (test_gettimeofday_monotonic) shows that one
 test loop iteration takes 34ns with tsc and 1270ns with hpet.
 
 This test program is great, I've wanted this exact sort of
 visibility into this problem for years.  I've toyed with writing
 something like this and then seeing what results it returns on all
 of the build farm animals.  For now I can just run it on all the
 hardware I have access to, which is not a small list.
 
 I think we should bundle this up similarly to test_fsync, document
 some best practices on time sources, and then point the vague
 warning about EXPLAIN overhead toward that.  Then new sources of
 timing overhead can point there too.   Much like low-level fsync
 timing, there's nothing PostgreSQL can do about it, the best we can
 do is provide a program to help users classify a system as likely or
 unlikely to run to have high-overhead timing.  I can make the needed
 docs changes, this is resolving a long standing issue impacting code
 I wanted to add.
 
 Rough guideline I'm seeing shape up is that 50ns is unlikely to
 cause clock timing to be a significant problem, 500ns certainly is,
 and values in the middle should concern people but not necessarily
 invalidate the timing data collected.
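
For anyone without the attached program handy, the kind of per-call figure
being quoted above can be sketched in a few lines of C (this is only an
illustration, not the attached test_gettimeofday_monotonic itself, which also
histograms the deltas between consecutive readings):

#include <stdio.h>
#include <sys/time.h>

int
main(void)
{
    const long     loops = 5 * 1000 * 1000;
    struct timeval start, stop, tv;
    long           i;
    double         elapsed_us;

    /* time back-to-back gettimeofday() calls and report the per-call cost */
    gettimeofday(&start, NULL);
    for (i = 0; i < loops; i++)
        gettimeofday(&tv, NULL);
    gettimeofday(&stop, NULL);

    elapsed_us = (stop.tv_sec - start.tv_sec) * 1e6 +
                 (stop.tv_usec - start.tv_usec);
    printf("Per loop: %.2f ns\n", 1000.0 * elapsed_us / loops);
    return 0;
}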
 
 I just confirmed that switching the clock source by echoing a new
 value into
 /sys/devices/system/clocksource/clocksource0/current_clocksource
 works (on the 2.6.32 kernel at least).  What we want to see on a
 good server is a result that looks like this, from my Intel i7-860
 system:
 
 $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
 tsc
 $ ./test_gettimeofday_monotonic 5
 Per loop: 39.30 ns
  usec:  count   percent
 4:  6  0.0%
 2:104  0.8%
 1:4999760  3.92983%
 0:  16109 96.07009%
 
 Here's how badly that degrades if I use one of the alternate time sources:
 
 # echo acpi_pm 
 /sys/devices/system/clocksource/clocksource0/current_clocksource
 $ ./test_gettimeofday_monotonic 5
 Per loop: 727.65 ns
  usec:  count   percent
16:  1  0.1%
 8:  0  0.0%
 4:   1233  0.01794%
 2:699  0.01017%
 1:4992622 72.65764%
 0:1876879 27.31423%
 
 echo hpet  /sys/devices/system/clocksource/clocksource0/current_clocksource
 $ ./test_gettimeofday_monotonic 5
 Per loop: 576.96 ns
  usec:  count   percent
 8:  2  0.2%
 4:   1273  0.01469%
 2:767  0.00885%
 1:4993028 57.61598%
 0:3670977 42.36046%
 
 Switching to the Intel T7200 CPU in my T60 laptop only provides the
 poor quality time sources, not TSC, and results show timing is
 really slow there:
 
 $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
 hpet
 $ ./test_gettimeofday_monotonic 5
 Per loop: 1019.60 ns
  usec:  count   percent
   256:  2  0.4%
   128:  3  0.6%
64: 90  0.00184%
32: 23  0.00047%
16: 92  0.00188%
 8:   1246  0.02541%
 4: 34  0.00069%
 2: 136154  2.77645%
 1:4700772 95.85818%
 0:  65466  1.33498%
 
 # echo acpi_pm 
 /sys/devices/system/clocksource/clocksource0/current_clocksource
 $ ./test_gettimeofday_monotonic 5
 Per loop: 1864.66 ns
  usec:  count   percent
   256:  2  0.7%
   128:  0  0.0%
64:  3  0.00011%
32:  6  0.00022%
16: 90  0.00336%
 8:   1741  0.06493%
 4:   2062  0.07690%
 2:2260601 84.30489%
 1: 416954 15.54952%
 0:  0  0.0%
 
 This seems to be measuring exactly the problem I only had a
 hand-wave "it's bad on this hardware" explanation of before.
 
 -- 
 Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
 PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Slow tab completion w/ lots of tables

2012-08-16 Thread Stephen Frost
Greetings,

  When doing tab-completion under 9.1, pg_table_is_visible(oid) is slow
  and is ending up as the first thing tested against all the rows
  in pg_class.  Increasing the cost of pg_table_is_visible() up to
  10 causes it to move to the end of the tests, which improves things
  greatly- I thought there was a plan to make that the default..?

  This is with 9.1.4.

  After researching this a bit, I'm left wondering why we're using
  substring() to do the matching test.  I don't see any easy way to
  index a substring() call which can be of any size (depending on how
  many characters are preceding the user hitting 'tab').  On the other
  hand, using LIKE with 'string%' and indexing with varchar_pattern_ops
  should give us a nice index which could be used for tab completion,
  right?  If no one points out an obvious flaw in that, I'll take a look
  at making that happen.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] huge tlb support

2012-08-16 Thread David Gould
On Mon, 9 Jul 2012 12:30:23 +0200
Andres Freund and...@2ndquadrant.com wrote:

 On Monday, July 09, 2012 08:11:00 AM Tom Lane wrote:
  y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) writes:
   Also, I was under the impression that recent Linux kernels use
   hugepages automatically if they can, so I wonder exactly what
   Andres was testing on ...
   
   if you mean the trasparent hugepage feature, iirc it doesn't
   affect MAP_SHARED mappings like this.
  
  Oh!  That would explain some things.  It seems like a pretty nasty
  restriction though ... do you know why they did that?
 Looking a bit deeper they explicitly only work on private memory. The
 reason apparently being that its too hard to update the page table
 entries in multiple processes at once without introducing locking
 problems/scalability issues.
 
 To be sure one can check /proc/$pid_of_pg_process/smaps and look for
 the mapping to /dev/zero or the biggest mapping ;). It's not counted as
 Anonymous memory and it doesn't have transparent hugepages. I was
 confused before because there are quite a few (400MB here) huge pages
 allocated for postgres during a pgbench run, but that's just all the
 local memory...

A warning, on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we have had
horrible problems caused by transparent_hugepages running postgres on
largish systems (128GB to 512GB memory, 32 cores). The system sometimes
goes 99% system time and is very slow and unresponsive to the point of
not successfully completing new tcp connections. Turning off
transparent_hugepages fixes it. 

That said, explicit hugepage support for the buffer cache would be a big
win especially for high connection counts.

-dg


-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] external_pid_file not removed on postmaster exit

2012-08-16 Thread Peter Eisentraut
On Fri, 2012-07-27 at 08:09 +0300, Peter Eisentraut wrote:
 It seems strange that the external_pid_file is never removed.  There is
 even a C comment about it:
 
 /* Should we remove the pid file on postmaster exit? */
 
 I think it should be removed with proc_exit hook just like the main
 postmaster.pid file.
 
 Does anyone remember why this was not done originally or have any
 concerns?

Since that was not the case, I propose the attached patch to unlink the
external pid file.
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index a4fb2a4..73520a6 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -329,6 +329,7 @@
 /*
  * postmaster.c - function prototypes
  */
+static void unlink_external_pid_file(int status, Datum arg);
 static void getInstallationPaths(const char *argv0);
 static void checkDataDir(void);
 static Port *ConnCreate(int serverFd);
@@ -1071,7 +1072,6 @@ static bool save_backend_variables(BackendParameters *param, Port *port,
 		{
 			fprintf(fpidfile, %d\n, MyProcPid);
 			fclose(fpidfile);
-			/* Should we remove the pid file on postmaster exit? */
 
 			/* Make PID file world readable */
 			if (chmod(external_pid_file, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) != 0)
@@ -1081,6 +1081,8 @@ static bool save_backend_variables(BackendParameters *param, Port *port,
 		else
 			write_stderr(%s: could not write external PID file \%s\: %s\n,
 		 progname, external_pid_file, strerror(errno));
+
+		on_proc_exit(unlink_external_pid_file, 0);
 	}
 
 	/*
@@ -1183,6 +1185,17 @@ static bool save_backend_variables(BackendParameters *param, Port *port,
 
 
 /*
+ * on_proc_exit callback to delete external_pid_file
+ */
+static void
+unlink_external_pid_file(int status, Datum arg)
+{
+	if (external_pid_file)
+		unlink(external_pid_file);
+}
+
+
+/*
  * Compute and check the directory paths to files that are part of the
  * installation (as deduced from the postgres executable's own location)
  */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers