Re: [HACKERS] external_pid_file not removed on postmaster exit

2012-08-16 Thread Peter Eisentraut
On Fri, 2012-07-27 at 08:09 +0300, Peter Eisentraut wrote:
> It seems strange that the external_pid_file is never removed.  There is
> even a C comment about it:
> 
> /* Should we remove the pid file on postmaster exit? */
> 
> I think it should be removed with proc_exit hook just like the main
> postmaster.pid file.
> 
> Does anyone remember why this was not done originally or have any
> concerns?

Since that was not the case, I propose the attached patch to unlink the
external pid file.
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index a4fb2a4..73520a6 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -329,6 +329,7 @@
 /*
  * postmaster.c - function prototypes
  */
+static void unlink_external_pid_file(int status, Datum arg);
 static void getInstallationPaths(const char *argv0);
 static void checkDataDir(void);
 static Port *ConnCreate(int serverFd);
@@ -1071,7 +1072,6 @@ static bool save_backend_variables(BackendParameters *param, Port *port,
 		{
 			fprintf(fpidfile, "%d\n", MyProcPid);
 			fclose(fpidfile);
-			/* Should we remove the pid file on postmaster exit? */
 
 			/* Make PID file world readable */
 			if (chmod(external_pid_file, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) != 0)
@@ -1081,6 +1081,8 @@ static bool save_backend_variables(BackendParameters *param, Port *port,
 		else
 			write_stderr("%s: could not write external PID file \"%s\": %s\n",
 		 progname, external_pid_file, strerror(errno));
+
+		on_proc_exit(unlink_external_pid_file, 0);
 	}
 
 	/*
@@ -1183,6 +1185,17 @@ static bool save_backend_variables(BackendParameters *param, Port *port,
 
 
 /*
+ * on_proc_exit callback to delete external_pid_file
+ */
+static void
+unlink_external_pid_file(int status, Datum arg)
+{
+	if (external_pid_file)
+		unlink(external_pid_file);
+}
+
+
+/*
  * Compute and check the directory paths to files that are part of the
  * installation (as deduced from the postgres executable's own location)
  */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] huge tlb support

2012-08-16 Thread David Gould
On Mon, 9 Jul 2012 12:30:23 +0200
Andres Freund  wrote:

> On Monday, July 09, 2012 08:11:00 AM Tom Lane wrote:
> > y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) writes:
> > >> Also, I was under the impression that recent Linux kernels use
> > >> hugepages automatically if they can, so I wonder exactly what
> > >> Andres was testing on ...
> > > 
> > > if you mean the "trasparent hugepage" feature, iirc it doesn't
> > > affect MAP_SHARED mappings like this.
> > 
> > Oh!  That would explain some things.  It seems like a pretty nasty
> > restriction though ... do you know why they did that?
> Looking a bit deeper they explicitly only work on private memory. The
> reason apparently being that its too hard to update the page table
> entries in multiple processes at once without introducing locking
> problems/scalability issues.
> 
> To be sure one can check /proc/$pid_of_pg_process/smaps and look for
> the mapping to /dev/zero or the biggest mapping ;). It's not counted as
> Anonymous memory and it doesn't have transparent hugepages. I was
> confused before because there are quite a lot (400MB here) of huge pages
> allocated for postgres during a pgbench run, but that's just all the
> local memory...

A warning: on RHEL 6.1 (2.6.32-131.4.1.el6.x86_64 #1 SMP) we have had
horrible problems caused by transparent_hugepages when running postgres on
largish systems (128GB to 512GB memory, 32 cores). The system sometimes
goes to 99% system time and becomes so slow and unresponsive that it cannot
even complete new TCP connections. Turning off
transparent_hugepages fixes it.
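
(For reference, THP can be disabled at runtime through sysfs on kernels of
that era; the exact path varies by distribution -- RHEL 6 uses a "redhat_"
prefix -- so treat the following as a sketch rather than a recipe.)

# mainline kernels
echo never > /sys/kernel/mm/transparent_hugepage/enabled
# RHEL 6 kernels
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled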

That said, explicit hugepage support for the buffer cache would be a big
win especially for high connection counts.
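
As an illustration of what "explicit" huge pages look like on Linux, here is
a minimal, self-contained sketch -- it assumes a 2.6.32+ kernel with
MAP_HUGETLB and pages reserved via vm.nr_hugepages, and it is not the code
PostgreSQL would use (the server's shared memory was SysV at the time):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_HUGETLB
#define MAP_HUGETLB 0x40000		/* missing from some older glibc headers */
#endif

int
main(void)
{
	size_t	size = 64 * 1024 * 1024;	/* must be a multiple of the huge page size */
	void   *ptr;

	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (ptr == MAP_FAILED)
	{
		perror("mmap(MAP_HUGETLB)");	/* typically ENOMEM if no pages are reserved */
		return 1;
	}
	memset(ptr, 0, size);				/* touch it so the pages really get allocated */
	munmap(ptr, size);
	return 0;
}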

-dg


-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Slow tab completion w/ lots of tables

2012-08-16 Thread Stephen Frost
Greetings,

  When doing tab-completion under 9.1, pg_table_is_visible(oid) is slow
  and ends up as the first thing tested against all the rows
  in pg_class.  Increasing the cost of pg_table_is_visible() to
  10 causes it to move to the end of the tests, which improves things
  greatly -- I thought there was a plan to make that the default?

  This is with 9.1.4.

  After researching this a bit, I'm left wondering why we're using
  substring() to do the matching test.  I don't see any easy way to
  index a substring() call which can be of any size (depending on how
  many characters the user typed before hitting 'tab').  On the other
  hand, using LIKE with 'string%' and indexing with varchar_pattern_ops
  should give us a nice index which could be used for tab completion,
  right?  If no one points out an obvious flaw in that, I'll take a look
  at making that happen.
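
  As a sketch of the idea (using a stand-in table, since we can't simply add
  indexes to pg_class, and the real tab-completion query is more involved
  than this):

  CREATE TABLE tabnames (relname varchar NOT NULL);
  CREATE INDEX tabnames_pattern_idx
      ON tabnames (relname varchar_pattern_ops);

  -- what tab completion effectively does today: substring() of arbitrary
  -- length, which has no straightforward index support
  SELECT relname FROM tabnames WHERE substring(relname, 1, 3) = 'foo';

  -- the proposed form: an anchored LIKE pattern, which the
  -- varchar_pattern_ops index can satisfy regardless of locale
  SELECT relname FROM tabnames WHERE relname LIKE 'foo%';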

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Timing overhead and Linux clock sources

2012-08-16 Thread Bruce Momjian

FYI, I am planning to go ahead and package this tool in /contrib for PG
9.3.

---

On Fri, Dec  9, 2011 at 08:26:12PM -0500, Greg Smith wrote:
> On 12/09/2011 06:48 PM, Ants Aasma wrote:
> >The attached test program (test_gettimeofday_monotonic) shows that one
> >test loop iteration takes 34ns with tsc and 1270ns with hpet.
> 
> This test program is great, I've wanted this exact sort of
> visibility into this problem for years.  I've toyed with writing
> something like this and then seeing what results it returns on all
> of the build farm animals.  For now I can just run it on all the
> hardware I have access to, which is not a small list.
> 
> I think we should bundle this up similarly to test_fsync, document
> some best practices on time sources, and then point the vague
> warning about EXPLAIN overhead toward that.  Then new sources of
> timing overhead can point there too.   Much like low-level fsync
> timing, there's nothing PostgreSQL can do about it, the best we can
> do is provide a program to help users classify a system as likely or
> unlikely to have high-overhead timing.  I can make the needed
> docs changes, this is resolving a long standing issue impacting code
> I wanted to add.
> 
> Rough guideline I'm seeing shape up is that <50ns is unlikely to
> cause clock timing to be a significant problem, >500ns certainly is,
> and values in the middle should concern people but not necessarily
> invalidate the timing data collected.
> 
> I just confirmed that switching the clock source by echoing a new
> value into
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> works (on the 2.6.32 kernel at least).  What we want to see on a
> good server is a result that looks like this, from my Intel i7-860
> system:
> 
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc
> $ ./test_gettimeofday_monotonic 5
> Per loop: 39.30 ns
>  usec:  count   percent
> 4:  6  0.0%
> 2:104  0.8%
> 1:4999760  3.92983%
> 0:  16109 96.07009%
> 
> Here's how badly that degrades if I use one of the alternate time sources:
> 
> # echo acpi_pm >
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ ./test_gettimeofday_monotonic 5
> Per loop: 727.65 ns
>  usec:  count   percent
>16:  1  0.1%
> 8:  0  0.0%
> 4:   1233  0.01794%
> 2:699  0.01017%
> 1:4992622 72.65764%
> 0:1876879 27.31423%
> 
> echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ ./test_gettimeofday_monotonic 5
> Per loop: 576.96 ns
>  usec:  count   percent
> 8:  2  0.2%
> 4:   1273  0.01469%
> 2:767  0.00885%
> 1:4993028 57.61598%
> 0:3670977 42.36046%
> 
> Switching to the Intel T7200 CPU in my T60 laptop only provides the
> poor quality time sources, not TSC, and results show timing is
> really slow there:
> 
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> hpet
> $ ./test_gettimeofday_monotonic 5
> Per loop: 1019.60 ns
>  usec:  count   percent
>   256:  2  0.4%
>   128:  3  0.6%
>64: 90  0.00184%
>32: 23  0.00047%
>16: 92  0.00188%
> 8:   1246  0.02541%
> 4: 34  0.00069%
> 2: 136154  2.77645%
> 1:4700772 95.85818%
> 0:  65466  1.33498%
> 
> # echo acpi_pm >
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ ./test_gettimeofday_monotonic 5
> Per loop: 1864.66 ns
>  usec:  count   percent
>   256:  2  0.7%
>   128:  0  0.0%
>64:  3  0.00011%
>32:  6  0.00022%
>16: 90  0.00336%
> 8:   1741  0.06493%
> 4:   2062  0.07690%
> 2:2260601 84.30489%
> 1: 416954 15.54952%
> 0:  0  0.0%
> 
> This seems to be measuring exactly the problem I only had a
> hand-wave "it's bad on this hardware" explanation of before.
> 
> -- 
> Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 09:51:28PM -0400, Stephen Frost wrote:
> Bruce,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
> > Did we make any headway on this?
> 
> I've got the code written but I need to test it in a stable environment
> to see what kind of improvement it provides.  I've actually been fighting
> for the past 2 months with the box that I want to use and think I may
> just give up and steal a different server.
> 
> It hasn't fallen off my list of things to do, just hasn't been a high
> priority.

OK, thanks for the status update.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Stephen Frost
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> Did we make any headway on this?

I've got the code written but I need to test it in a stable environment
to see what kind of improvement it provides.  I've actually been fighting
for the past 2 months with the box that I want to use and think I may
just give up and steal a different server.

It hasn't fallen off my list of things to do, just hasn't been a high
priority.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2012-08-16 Thread Bruce Momjian

Was this resolved?  (Sorry to be bugging everyone.)

---

On Tue, Nov 29, 2011 at 11:42:10PM -0500, Robert Haas wrote:
> On Tue, Nov 29, 2011 at 11:50 AM, Bruce Momjian  wrote:
> > Just confirming we decided not to persue this.
> 
> Doesn't sound like it.
> 
> I've been thinking a lot about the more general problem here - namely,
> that allowing catalog changes without an access exclusive lock is
> unsafe - and trying to come up with a good solution, because I'd
> really like to see us make this work.  It strikes me that there are
> really two separate problems here.
> 
> 1. If you are scanning a system catalog using SnapshotNow, and a
> commit happens at just the right moment, you might see two versions of
> the row or zero rather than one.  You'll see two if you scan the old
> row version, then the concurrent transaction commits, and then you
> scan the new one.  You'll see zero if you scan the new row (ignoring
> it, since it isn't committed yet) and then the concurrent transaction
> commits, and then you scan the old row.  On a related note, if you
> scan several interrelated catalogs (e.g. when building a relcache
> entry) and a transaction that has made matching modifications to
> multiple catalogs commits midway through, you might see the old
> version of the data from one table and the new version of the data
> from some other table.  Using AccessExclusiveLock fixes the problem
> because we guarantee that anybody who wants to read those rows first
> takes some kind of lock, which they won't be able to do while the
> updater holds AccessExclusiveLock.
> 
> 2. Other backends may have data in the relcache or catcaches which
> won't get invalidated until they do AcceptInvalidationMessages().
> That will always happen no later than the next time they lock the
> relation, so if you are holding AccessExclusiveLock then you should be
> OK: no one else holds any lock, and they'll have to go get one before
> doing anything interesting.  But if you are holding any weaker lock,
> there's nothing to force AcceptInvalidationMessages to happen before
> you reuse those cache entries.
> 
> In both cases, name lookups are an exception: we don't know what OID
> to lock until we've done the name lookup.
> 
> Random ideas for solving the first problem:
> 
> 1-A. Take a fresh MVCC snapshot for each catalog scan.  I think we've
> rejected this before because of the cost involved, but maybe with
> Pavan's patch and some of the other improvements that are on the way
> we could make this cheap enough to be acceptable.
> 1-B. Don't allow transactions that have modified system catalog X to
> commit while we're scanning system catalog X; make the scans take some
> kind of shared lock and commits an exclusive lock.  This seems awfully
> bad for concurrency, though, not to mention the overhead of taking and
> releasing all those locks even when nothing conflicts.
> 1-C. Wrap a retry loop around every place where we scan a system
> catalog.  If a transaction that has written rows in catalog X commits
> while we're scanning catalog X, discard the results of the first scan
> and declare a do-over (or maybe something more coarse-grained, or at
> the other extreme even more fine-grained).  If we're doing several
> interrelated scans then the retry loop must surround the entire unit,
> and enforce a do-over of the whole operation if commits occur in any
> of the catalogs before the scan completes.
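
A toy, self-contained illustration of the shape of 1-C (the "commit counter"
here is just a struct member; in the real thing it would live in shared
memory and be bumped by committing writers -- invented for illustration, not
PostgreSQL code):

#include <stdio.h>

struct catalog
{
	unsigned long commit_counter;	/* bumped whenever a writer commits */
	int			  value;			/* stand-in for the catalog contents */
};

static int
read_with_retry(struct catalog *cat)
{
	for (;;)
	{
		unsigned long before = cat->commit_counter;
		int			  snapshot = cat->value;	/* the "scan" */

		if (cat->commit_counter == before)
			return snapshot;	/* nothing committed mid-scan: done */
		/* else a concurrent commit slipped in: discard and rescan */
	}
}

int
main(void)
{
	struct catalog cat = {0, 42};

	printf("read %d\n", read_with_retry(&cat));
	return 0;
}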
> 
> Unfortunately, any of these seem likely to require a Lot of Work,
> probably more than can be done in time for 9.2.  However, perhaps it
> would be feasible to hone in on a limited subset of the cases (e.g.
> name lookups, which are mainly done in only a handful of places, and
> represent an existing bug) and implement the necessary code changes
> just for those cases.  Then we could generalize that pattern to other
> cases as time and energy permit.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2012-08-16 Thread Bruce Momjian

A TODO for this?

---

On Tue, Dec  6, 2011 at 02:53:42PM -0500, Robert Haas wrote:
> On Tue, Dec 6, 2011 at 7:12 AM, Florian Weimer  wrote:
> > * Robert Haas:
> >
> >> I tried whacking out the call to GetPageWithFreeSpace() in
> >> RelationGetBufferForTuple(), and also with the unpatched code, but the
> >> run-to-run randomness was way more than any difference the change
> >> made.  Is there a better test case?
> >
> > I think that if you want to exercise file system lookup performance, you
> > need a larger directory, which presumably means a large number of
> > tables.
> 
> OK.  I created 100,000 dummy tables, 10,000 at a time to avoid blowing up
> the lock manager.  I then repeated my previous tests, and I still
> can't see any meaningful difference (on my MacBook Pro, running MacOS
> X v10.6.8).  So at least on this OS, it doesn't seem to matter much.
> I'm inclined to defer putting any more work into it until such time as
> someone can demonstrate that it actually causes a problem and provides
> a reproducible test case.  I don't deny that there's probably an
> effect and it would be nice to improve this, but it doesn't seem worth
> spending a lot of time on until we can find a case where the effect is
> measurable.
> 
> On the other hand, the problem of the FSM taking up 24kB for an 8kB
> table seems clearly worth fixing, but I don't think I have the cycles
> for it at present.  Maybe a TODO is in order.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding repeated snapshot computation

2012-08-16 Thread Bruce Momjian

Did we ever make a decision on this patch?

---

On Sat, Nov 26, 2011 at 09:22:50PM +0530, Pavan Deolasee wrote:
> On some recent benchmarks and profile data, I saw GetSnapshotData
> figures at the very top or near top. For lesser number of clients, it
> can account for 10-20% of time, but with more clients I have seen
> it taking up as much as 40% of sample time. Unfortunately, the machine
> on which I was running these tests is currently not available and so I
> don't have the exact numbers. But the observation is almost correct.
> Our recent work on separating the hot members of PGPROC in a separate
> array would definitely reduce data cache misses and reduce the
> GetSnapshotData time, but it probably still accounts for a large
> enough critical section for a highly contended lock.
> 
> I think now that we have reduced the run time of the function itself,
> we should now try to reduce the number of times the function is
> called. Robert proposed a way to reduce the number of calls per
> transaction. I think we can go one step further and reduce the
> number of calls across transactions.
> 
> One major problem today could be because the way LWLock works. If the
> lock is currently held in SHARED mode by some backend and some other
> backend now requests it in SHARED mode, it will immediately get it.
> That's probably the right thing to do because you don't want the reader
> to really wait when the lock is readily available. But in the case of
> GetSnapshotData(), every reader is doing exactly the same thing; they
> are computing a snapshot based on the same shared state and would
> compute exactly the same snapshot (if we ignore the fact that we don't
> include caller's XID in xip array, but thats a minor detail). And
> because the way LWLock works, more and more readers would get in to
> compute the snapshot, until the exclusive waiters get a window to
> sneak in, either because more and more processes slowly start sleeping
> for exclusive access. To depict it, the four transactions make
> overlapping calls for GetSnapshotData() and hence the total critical
> section starts when the first caller enters it and the ends when the
> last caller exits.
> 
> Txn1 --[      SHARED      ]----------------
> Txn2 ------[      SHARED      ]------------
> Txn3 ----------[      SHARED      ]--------
> Txn4 --------------[      SHARED      ]----
>      |<------------Total Time------------>|
> 
> Couple of ideas come to mind to solve this issue.
> 
> A snapshot once computed will remain valid for every call irrespective
> of its origin until at least one transaction ends. So we can store the
> last computed snapshot in some shared area and reuse it for all
> subsequent GetSnapshotData calls. The shared snapshot will get
> invalidated when some transaction ends by calling
> ProcArrayEndTransaction(). I tried this approach and saw a 15%
> improvement for 32-80 clients on the 32 core HP IA box with pgbench -s
> 100 -N tests. Not bad, but I think this can be improved further.
> 
> What we can do is when a transaction comes to compute its snapshot, it
> checks if some other transaction is already computing a snapshot for
> itself. If so, it just sleeps on the lock. When the other process
> finishes computing the snapshot, it saves the snapshot is a shared
> area and wakes up all processes waiting for the snapshot. All those
> processes then just copy the snapshot from the shared area and they
> are done. This will not only reduce the total CPU consumption by
> avoiding repetitive work, but would also reduce the total time for
> which ProcArrayLock is held in SHARED mode by avoiding pipeline of
> GetSnapshotData calls. I am currently trying the shared work queue
> mechanism to implement this, but I am sure we can do it this in some
> other way too.
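
A toy sketch of the caching idea above -- a mutex-guarded cached snapshot
plus a generation counter standing in for invalidation at transaction end;
the names and structure are invented for illustration and are not
PostgreSQL code:

#include <pthread.h>
#include <stdio.h>

typedef struct
{
	long	gen;		/* which "transaction era" this snapshot belongs to */
	long	xmin;		/* stand-in for the real snapshot contents */
} Snapshot;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long		current_gen = 1;	/* bumped when any transaction ends */
static Snapshot cached = {0, 0};	/* gen 0 means nothing cached yet */

static Snapshot
get_snapshot(void)
{
	Snapshot	result;

	pthread_mutex_lock(&lock);
	if (cached.gen != current_gen)
	{
		/* first caller in this era does the expensive computation ... */
		cached.gen = current_gen;
		cached.xmin = 1000 + current_gen;	/* pretend work */
	}
	result = cached;		/* ... everyone else just copies the cached one */
	pthread_mutex_unlock(&lock);
	return result;
}

int
main(void)
{
	printf("xmin=%ld\n", get_snapshot().xmin);
	current_gen++;			/* a transaction ended: invalidate the cache */
	printf("xmin=%ld\n", get_snapshot().xmin);
	return 0;
}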
> 
> Thanks,
> Pavan
> 
> -- 
> Pavan Deolasee
> EnterpriseDB     http://www.enterprisedb.com
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not HOT enough

2012-08-16 Thread Bruce Momjian

Did we want to apply this?

---

On Wed, Nov 23, 2011 at 07:33:18PM +, Simon Riggs wrote:
> On Wed, Nov 23, 2011 at 6:15 PM, Tom Lane  wrote:
> 
> >> The real question is do we favour HOT cleanup on those small 8 tables,
> >> or do we favour HOT cleanup of every other table?
> >
> > No, the real question is why not think a little harder and see if we can
> > come up with a solution that doesn't involve making some cases worse to
> > make others better.
> 
> Slightly modified patch attached.
> 
> When we access a shared relation and the page is prunable we re-derive
> the cutoff value using GetOldestXmin.
> 
> That code path is rarely taken. In particular, pg_shdepend is only
> accessed during object creation/alter/drop, so this is a path where we
> can spare a little extra, just like the trade-off
> we've taken to make locking faster for DML while making DDL a little
> slower.
> 
> If this is still unacceptable, then I'll have to look at reducing
> impact on pg_shdepend from temp tables - which is still a plan.
> 
> -- 
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services

> diff --git a/src/backend/access/heap/pruneheap.c 
> b/src/backend/access/heap/pruneheap.c
> index 61f2ce4..5feaedc 100644
> --- a/src/backend/access/heap/pruneheap.c
> +++ b/src/backend/access/heap/pruneheap.c
> @@ -16,13 +16,13 @@
>  
>  #include "access/heapam.h"
>  #include "access/transam.h"
> +#include "catalog/catalog.h"
>  #include "miscadmin.h"
>  #include "pgstat.h"
>  #include "storage/bufmgr.h"
>  #include "utils/rel.h"
>  #include "utils/tqual.h"
>  
> -
>  /* Working data for heap_page_prune and subroutines */
>  typedef struct
>  {
> @@ -72,6 +72,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, 
> TransactionId OldestXmin)
>  {
>   Pagepage = BufferGetPage(buffer);
>   Sizeminfree;
> + TransactionId   CutoffXmin = OldestXmin;
>  
>   /*
>* Let's see if we really need pruning.
> @@ -91,6 +92,18 @@ heap_page_prune_opt(Relation relation, Buffer buffer, 
> TransactionId OldestXmin)
>   return;
>  
>   /*
> +  * We choose to set RecentGlobalXmin only for the current database, 
> which
> +  * means we cannot use it to prune shared relations when reading them 
> with
> +  * MVCC snapshots. By making that choice it allows our snapshots to be
> +  * smaller and faster, and the RecentGlobalXmin will be further forward
> +  * and offer better pruning of non-shared relations. So when accessing a
> +  * shared relation and we see the page is prunable (above) we get an
> +  * OldestXmin across all databases.
> +  */
> + if (IsSharedRelation(relation->rd_id))
> + CutoffXmin = GetOldestXmin(true, true);
> +
> + /*
>* We prune when a previous UPDATE failed to find enough space on the 
> page
>* for a new tuple version, or when free space falls below the 
> relation's
>* fill-factor target (but not less than 10%).
> @@ -124,7 +137,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, 
> TransactionId OldestXmin)
>   
>  * needed */
>  
>   /* OK to prune */
> - (void) heap_page_prune(relation, buffer, OldestXmin, 
> true, &ignore);
> + (void) heap_page_prune(relation, buffer, CutoffXmin, 
> true, &ignore);
>   }
>  
>   /* And release buffer lock */
> diff --git a/src/backend/storage/ipc/procarray.c 
> b/src/backend/storage/ipc/procarray.c
> index 1a48485..60153f4 100644
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c
> @@ -55,6 +55,7 @@
>  #include "storage/spin.h"
>  #include "utils/builtins.h"
>  #include "utils/snapmgr.h"
> +#include "utils/tqual.h"
>  
>  
>  /* Our shared memory area */
> @@ -1185,6 +1186,8 @@ GetMaxSnapshotSubxidCount(void)
>   *   RecentGlobalXmin: the global xmin (oldest TransactionXmin 
> across all
>   *   running transactions, except those running LAZY 
> VACUUM).  This is
>   *   the same computation done by GetOldestXmin(true, true).
> + *   For MVCC snapshots we examine transactions running only 
> in our
> + *   database, ignoring longer running transactions in other 
> databases.
>   *
>   * Note: this function should probably not be called with an argument that's
>   * not statically allocated (see xip allocation below).
> @@ -1200,9 +1203,12 @@ GetSnapshotData(Snapshot snapshot)
>   int count = 0;
>   int subcount = 0;
>   boolsuboverflowed = false;
> + boolallDbs = false;
>  
>   Assert(snapshot !

Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Bruce Momjian

Is this a TODO?

---

On Sat, Nov 19, 2011 at 12:31:09PM -0500, Stephen Frost wrote:
> Andres,
> 
> * Andres Freund (and...@anarazel.de) wrote:
> > For that I added new functions/defines which allocate all the needed memory 
> > in 
> > one hunk:
> > list_immutable_make$n(),
> > List *list_new_immutable_n(NodeTag t, size_t n);
> > List *list_make_n(NodeTag t, size_t n, ...);
> 
> A while back, I posted a patch to try and address this same issue.  The
> approach that I took was to always pre-allocate a certain (#defined)
> amount (think it was 5 or 10 elements).  There were a number of places
> that caused problems with that approach because they hacked on the list
> element structures directly (instead of using the macros/functions)-
> you'll want to watch out for those areas in any work on lists.
> 
> That patch is here:
> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php
> 
> The thread on it might also be informative.
> 
> I do like your approach of being able to pass the ultimate size of the
> list in..  Perhaps the two approaches could be merged?  I was able to
> make everything work with my approach, provided all the callers used the
> list API (I did that by making sure the links, etc, actually pointed to
> the right places in the pre-allocated array).  One downside was that the
> size ended up being larger than it might have been in some cases.
> 
>   Thanks,
> 
>   Stephen



-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: list API / memory allocations

2012-08-16 Thread Bruce Momjian

Did we make any headway on this?

---

On Sat, Nov 19, 2011 at 12:31:09PM -0500, Stephen Frost wrote:
> Andres,
> 
> * Andres Freund (and...@anarazel.de) wrote:
> > For that I added new functions/defines which allocate all the needed memory 
> > in 
> > one hunk:
> > list_immutable_make$n(),
> > List *list_new_immutable_n(NodeTag t, size_t n);
> > List *list_make_n(NodeTag t, size_t n, ...);
> 
> A while back, I posted a patch to try and address this same issue.  The
> approach that I took was to always pre-allocate a certain (#defined)
> amount (think it was 5 or 10 elements).  There were a number of places
> that caused problems with that approach because they hacked on the list
> element structures directly (instead of using the macros/functions)-
> you'll want to watch out for those areas in any work on lists.
> 
> That patch is here:
> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php
> 
> The thread on it might also be informative.
> 
> I do like your approach of being able to pass the ultimate size of the
> list in..  Perhaps the two approaches could be merged?  I was able to
> make everything work with my approach, provided all the callers used the
> list API (I did that by making sure the links, etc, actually pointed to
> the right places in the pre-allocated array).  One downside was that the
> size ended up being larger than it might have been in some cases.
> 
>   Thanks,
> 
>   Stephen



-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Dickson S. Guedes
2012/8/16 Fabrízio de Royes Mello :
> The attached patch implement this feature:
>
> CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ] [
> schema_element [ ... ] ]
> CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element [
> ... ] ]
>
> Now, PostgreSQL doesn't throw an error if we use "IF NOT EXISTS" in a
> "CREATE SCHEMA" statement.

I started testing this, but I didn't see regression tests for it.
Could you write them?

Best.
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regression tests fail once XID counter exceeds 2 billion

2012-08-16 Thread Bruce Momjian
On Wed, Nov 16, 2011 at 07:08:27PM -0500, Tom Lane wrote:
> I wrote:
> > Simon Riggs  writes:
> >> We need a function called transactionid_current() so a normal user can 
> >> write
> 
> >> select virtualtransaction
> >> from pg_locks
> >> where transactionid = transactionid_current()
> 
> >> and have it "just work".
> 
> > That would solve that one specific use-case.  The reason I suggested
> > txid_from_xid is that it could also be used to compare XIDs seen in
> > tuples to members of a txid_snapshot, which is not possible now.
> 
> BTW, a pgsql-general question just now made me realize that
> txid_from_xid() could have another use-case too.  Right now, there are
> no inequality comparisons on XIDs, which is necessary because XIDs in
> themselves don't have a total order.  However, you could
> 
>   ORDER BY txid_from_xid(xmin)
> 
> and it would work, ie, give you rows in their XID order.  This could be
> useful for finding the latest-modified rows in a table, modulo the fact
> that it would be ordering by transaction start time not commit time.

Added to TODO:

Add function to allow easier transaction id comparisons

http://archives.postgresql.org/pgsql-hackers/2011-11/msg00786.php 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding shutdown checkpoint at failover

2012-08-16 Thread Bruce Momjian
On Thu, Mar  8, 2012 at 08:20:02AM -0500, Robert Haas wrote:
> On Sat, Jan 28, 2012 at 8:57 AM, Simon Riggs  wrote:
> > On Thu, Jan 26, 2012 at 5:27 AM, Fujii Masao  wrote:
> >
> >> One thing I would like to ask is that why you think walreceiver is more
> >> appropriate for writing XLOG_END_OF_RECOVERY record than startup
> >> process. I was thinking the opposite, because if we do so, we might be
> >> able to skip the end-of-recovery checkpoint even in file-based log-shipping
> >> case.
> >
> > Right now, WALReceiver has one code path/use case.
> >
> > Startup has so many, its much harder to know whether we'll screw up one of 
> > them.
> >
> > If we can add it in either place then I choose the simplest, most
> > relevant place. If the code is the same, we can move it around later.
> >
> > Let me write the code and then we can think some more.
> 
> Are we still considering trying to do this for 9.2?  Seems it's been
> over a month without a new patch, and it's not entirely clear that we
> know what the design should be.

Did this get completed?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-08-16 Thread Peter Geoghegan
On 27 July 2012 16:39, Jeff Janes  wrote:
>> Can you suggest a benchmark that will usefully exercise this patch?
>
> I think the given sizes below work on most 64 bit machines.

My apologies for not getting around to taking a look at this sooner.

I tested this patch on my x86_64 laptop:

[peter@peterlaptop tests]$ uname -r
3.4.4-4.fc16.x86_64

I have been able to recreate your results with the work_mem setting
you supplied, 16384 (both queries that you suggested are executed
together, for a less sympathetic workload):

[peter@peterlaptop tests]$ cat sortmemgrow_sort.sql
select count(distinct foo) from (select random() as foo from
generate_series(1,524200)) asdf;
select count(distinct foo) from (select random() as foo from
generate_series(1,524300)) asdf;
[peter@peterlaptop tests]$ # For master:
[peter@peterlaptop tests]$ pgbench -f sortmemgrow_sort.sql -T 45 -n
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 45 s
number of transactions actually processed: 45
tps = 0.992526 (including connections establishing)
tps = 0.992633 (excluding connections establishing)
[peter@peterlaptop tests]$ # For patch:
[peter@peterlaptop tests]$ pgbench -f sortmemgrow_sort.sql -T 45 -n
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 45 s
number of transactions actually processed: 66
tps = 1.461739 (including connections establishing)
tps = 1.461911 (excluding connections establishing)

I didn't find trace_sort all that useful for my earlier work on
optimising in memory-sorting (at least when running benchmarks), as
the granularity is too small for simple, relatively inexpensive
queries with sort nodes (or some tuplesort usage, like the datum sort
of your example). Also, the instrumentation can have a Heisenberg
effect. I've avoided using it here. The less expensive query's
execution costs are essentially the same with and without the patch.

This patch works by not doubling the size of state->memtupsize when
available memory is less than half of allowed memory (i.e. we are one
call to grow_memtuples() away from reporting ). Rather, in that
situation, it is set to a size that extrapolates the likely size of
the total amount of memory needed in a way that is quite flawed, but
likely to work well for the pass-by-value Datum case. Now, on the face
of it, this may appear to be a re-hashing of the question of "how
paranoid do we need to be about wasting memory in memory-constrained
situations - should we consider anything other than a geometric growth
rate, to be parsimonious with memory at the risk of thrashing?".
However, this is not really the case, because this is only a single,
last-ditch effort to avoid a particularly undesirable eventuality. We
have little to lose and quite a bit to gain. A cheap guesstimation
seems reasonable.
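
To make the shape of that heuristic concrete, a toy version (the numbers and
names are invented; the real code works in SortTuple slots and bytes rather
than fractions of work_mem):

#include <stdio.h>

static int
next_memtuples_size(int oldsize, double mem_now_used, double allowed_mem)
{
	if (mem_now_used <= allowed_mem / 2)
		return oldsize * 2;		/* normal geometric growth */

	/*
	 * One doubling away from running out: guess the largest size that still
	 * fits, rounding down (floor) so a stray extra element can't push us over.
	 */
	return (int) (oldsize * (allowed_mem / mem_now_used));
}

int
main(void)
{
	/* 1024 slots currently use 60% of the budget: grow to ~1706, not 2048 */
	printf("%d\n", next_memtuples_size(1024, 0.60, 1.0));
	return 0;
}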

I have attached a revision for your consideration, with a few
editorialisations, mostly style-related.

I removed this entirely:

+*  is the new method still follow this?  The last allocation is no
+* longer necessarily a power of 2, but that is not freed.

I did so because, according to aset.c:

 *  Further improvement 12/00: as the code stood, request sizes in the
 *  midrange between "small" and "large" were handled very inefficiently,
 *  because any sufficiently large free chunk would be used to satisfy a
 *  request, even if it was much larger than necessary.  This led to more
 *  and more wasted space in allocated chunks over time.  To fix, get rid
 *  of the midrange behavior: we now handle only "small" power-of-2-size
 *  chunks as chunks.  Anything "large" is passed off to malloc().  Change
 *  the number of freelists to change the small/large boundary.

So, at the top of grow_memtuples, this remark still holds:

 * and so there will be no unexpected addition to what we ask for.  (The
 * minimum array size established in tuplesort_begin_common is large
 * enough to force palloc to treat it as a separate chunk, so this
 * assumption should be good.  But let's check it.)

It continues to hold because we're still invariably passing off this
request to malloc() (or, in this particular case, realloc())
regardless of the alignment of the request size. Certainly, the extant
code verifies !LACKMEM, and the regression tests pass when this patch
is applied.

However, there is still a danger of LACKMEM. This revision has the
following logic for extrapolating newmemtupsize (This is almost the
same as the original patch):

+   newmemtupsize = (int) floor(oldmemtupsize * allowedMem / 
memNowUsed);

Suppose that the code was:

+   newmemtupsize = (int) ceil(oldmemtupsize * allowedMem / 
memNowUsed);

We'd now have an error with your latter example query because
!LACKMEM, due to the inclusion of a single additional SortTuple. This
is because GetMemoryChunkSpace (and, ultimately,
AllocS

Re: [HACKERS] heap_page_prune comments

2012-08-16 Thread Bruce Momjian
On Wed, Nov  2, 2011 at 12:27:02PM -0400, Robert Haas wrote:
> The following comment - or at least the last sentence thereof -
> appears to be out of date.
> 
> /*
>  * XXX Should we update the FSM information of this page ?
>  *
>  * There are two schools of thought here. We may not want to update 
> FSM
>  * information so that the page is not used for unrelated
> UPDATEs/INSERTs
>  * and any free space in this page will remain available for further
>  * UPDATEs in *this* page, thus improving chances for doing HOT 
> updates.
>  *
>  * But for a large table and where a page does not receive
> further UPDATEs
>  * for a long time, we might waste this space by not updating the FSM
>  * information. The relation may get extended and fragmented further.
>  *
>  * One possibility is to leave "fillfactor" worth of space in this 
> page
>  * and update FSM with the remaining space.
>  *
>  * In any case, the current FSM implementation doesn't accept
>  * one-page-at-a-time updates, so this is all academic for now.
>  */
> 
> The simple fix here is just to delete that last sentence, but does
> anyone think we ought to do change the behavior, now that we have the
> option to do so?

Last sentence removed.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Clean up the #include mess a little.

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 11:15:24AM -0400, Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of mié ago 15 18:30:40 -0400 2012:
> > 
> > On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
> > > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
> > >  wrote:
> > > > I wonder what happens if files in the same subdir are grouped in a
> > > > subgraph.  Is that possible?
> > > 
> > > Possible, and done. Also added the possibility to add .c files to the graph,
> > > coloring by subdir, and the possibility to exclude nodes from the graph. I
> > > didn't yet bother to clean up the code - to avoid eye damage, don't look at the
> > > source.
> > > 
> > > Bad news is that it doesn't significantly help readability for the all 
> > > nodes
> > > case. See all_with_subgraphs.svgz.  It does help for other cases.
> > > For example parsenodes.h.svgz has the result for
> > > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
> > > and execnodes.h.svgz for
> > > --subgraphs --select='nodes/execnodes.h+*-*'
> > 
> > Should we add this script and instructions to src/tools/pginclude?
> 
> Probably not, but maybe the developer FAQ in the wiki?

I just added the script email URL to the pginclude README.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unreproducible bug in snapshot import code

2012-08-16 Thread Bruce Momjian
On Fri, Oct 28, 2011 at 11:07:25AM -0400, Gurjeet Singh wrote:
> On Fri, Oct 28, 2011 at 10:11 AM, Bruce Momjian  wrote:
> 
> Gurjeet Singh wrote:
> > > > I have tried reproducing the bug starting from 1 and 2 transactions
> > > before
> > > > the one shown in snippet, and I used tab-completion to get the same
> > > > screen-output as termonal1.txt and yet it's not reproducible.
> > >
> > > I could reproduce it when I typed TAB just after typing "set" in "set
> > > transaction snapshot".
> > > As Tom and Alvaro pointed out, the tab-completion issues a query and
> which
> > > prevents the "set transaction snapshot" command.
> > >
> >
> > Great! That settles it then. Reproducible, but not a bug.
> 
> Yes, it is only tabs that query the database for completion that cause
> this.  Should this be documented somehow?  (No idea how.)
> 
> 
> If we have a doc section on psql's tab-completion, I think this needs to go
> there. A note like:
> 
> "Trying to tab-complete on psql may send queries to the server, and depending
> on the transaction state, execution of these queries may lead to non-default/
> unexpected behaviour by the queries executed after tab-completion. For 
> example,
> ..."

I have added the attached patch to document this limitation.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index ebb0ad4..a47af51
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** testdb=> \set PROMPT1 '%[%033[1;33;40
*** 3272,3278 
  exits and is reloaded when
  psql starts up. Tab-completion is also
  supported, although the completion logic makes no claim to be an
! SQL parser.  If for some reason you do not like the tab completion, you
  can turn it off by putting this in a file named
  .inputrc in your home directory:
  
--- 3272,3281 
  exits and is reloaded when
  psql starts up. Tab-completion is also
  supported, although the completion logic makes no claim to be an
! SQL parser.  The queries generated by tab-completion
! can also interfere with other SQL commands, e.g. SET
! TRANSACTION ISOLATION LEVEL.
! If for some reason you do not like the tab completion, you
  can turn it off by putting this in a file named
  .inputrc in your home directory:
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] State of the on-disk bitmap index

2012-08-16 Thread Daniel Bausch
Hello Jonah, Simon, and the hackers,

I am going to implement a simple kind of "encoded bitmap index" (EBI).
That is an index type where the bitmap columns may contain more than a
single '1' in the set of bits belonging to a tuple.  Instead, an
additional mapping table translates the distinct values of the table
column into a unique encoding.  To select for a given value, all bitmap
columns must be compared instead of only one.  Queries that match
multiple different values (like IN lists or range queries) need to
compare fewer than the full set of bitmaps, thanks to boolean logic.
The total number of bitmaps required to represent unique encodings for
all different values is ceil(ld(n)), i.e. the binary logarithm, where n
is the number of distinct values.  Compared to normal bitmap indexes
this solves the problem of high-cardinality columns.  It is targeted at
data warehousing scenarios with insert-only data.

The respective scientific paper can be found at
http://www.dvs.tu-darmstadt.de/publications/pdf/ebi_a4.pdf
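
To make the encoding concrete, a tiny self-contained illustration (the names
and layout are invented and not based on the proposed on-disk bitmap index
code; one uint64_t word holds the bits for up to 64 rows):

#include <stdint.h>
#include <stdio.h>

#define NROWS		8	/* rows in the toy table */
#define NBITMAPS	2	/* 4 distinct values -> ceil(ld(4)) = 2 bitmaps */

int
main(void)
{
	/* column values for rows 0..7, already mapped to codes 0..3 */
	int			code_of_row[NROWS] = {0, 1, 2, 3, 1, 1, 0, 2};
	uint64_t	bitmap[NBITMAPS] = {0, 0};
	uint64_t	result = ~UINT64_C(0) >> (64 - NROWS);	/* all candidate rows */
	int			code = 1;	/* probe: which rows hold the value encoded as 1? */

	/* build: bitmap[j] has a row's bit set iff bit j of that row's code is 1 */
	for (int row = 0; row < NROWS; row++)
		for (int j = 0; j < NBITMAPS; j++)
			if (code_of_row[row] & (1 << j))
				bitmap[j] |= UINT64_C(1) << row;

	/* probe: AND bitmap[j] where the code bit is 1, NOT bitmap[j] where it is 0 */
	for (int j = 0; j < NBITMAPS; j++)
		result &= (code & (1 << j)) ? bitmap[j] : ~bitmap[j];

	printf("rows matching code %d: 0x%llx\n", code,
		   (unsigned long long) result);	/* rows 1, 4 and 5 -> 0x32 */
	return 0;
}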

I thought it could be a good idea to base my work on the long-proposed
on-disk bitmap index implementation.  According to the wiki, you, Jonah
and Simon, were the last devs who touched it.  Unfortunately I
could not find the patch representing your state of that work.  I could
only capture the development history up to Gianni Ciolli & Gabriele
Bartolini from the old pgsql-patches archives.  Other people involved
were Jie Zhang, Gavin Sherry, Heikki Linnakangas, and Leonardo F.  Are
you and the others still interested in getting this into PG?  A rebase
of the most current bitmap index implementation onto master HEAD will be
the first 'byproduct' that I am going to deliver back to you.

1. Is anyone working on this currently?
2. Who has got the most current source code?
3. Is there a git of that or will I need to reconstruct the history from
the patches I collected?

Disclaimer: Although I read and manually processed most of the
executor's code during my last work, I would still call myself a newbie
and therefore will need some assistance most probably.

Regards,
Daniel

-- 
Daniel Bausch
Wissenschaftlicher Mitarbeiter
Technische Universität Darmstadt
Fachbereich Informatik
Fachgebiet Datenbanken und Verteilte Systeme

Hochschulstraße 10
64289 Darmstadt
Germany

Tel.: +49 6151 16 6706
Fax:  +49 6151 16 6229


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner avoidance of index only scans for partial indexes

2012-08-16 Thread Josh Berkus
On 8/15/12 3:28 PM, Merlin Moncure wrote:
> Aside: the performance gains I'm seeing for IOS are nothing short of
> spectacular.

Do you have some metrics?  I could use them for publicity stuff.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TRUE/FALSE vs true/false

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 02:21:12PM -0500, Kevin Grittner wrote:
> Bruce Momjian  wrote:
>  
> > So what do we want to do with this?  I am a little concerned that
> > we are sacrificing code clarity for backpatching ease, but I don't
> > do as much backpatching as Tom.
>  
> Well, if you back-patched this change, it would eliminate the issue
> for Tom, wouldn't it?  Not sure if that's sane; just a thought.

I would be worried about some instability in backpatching.  I was
looking for an 'ignore-case' mode to patch, but I don't see it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TRUE/FALSE vs true/false

2012-08-16 Thread Kevin Grittner
Bruce Momjian  wrote:
 
> So what do we want to do with this?  I am a little concerned that
> we are sacrificing code clarity for backpatching ease, but I don't
> do as much backpatching as Tom.
 
Well, if you back-patched this change, it would eliminate the issue
for Tom, wouldn't it?  Not sure if that's sane; just a thought.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TRUE/FALSE vs true/false

2012-08-16 Thread Bruce Momjian
On Tue, Aug 14, 2012 at 10:57:08PM -0400, Peter Eisentraut wrote:
> On Tue, 2012-08-14 at 17:36 -0400, Bruce Momjian wrote:
> > On Tue, Aug 14, 2012 at 05:34:02PM -0400, Tom Lane wrote:
> > > Bruce Momjian  writes:
> > > > On Thu, Aug  4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
> > > >> On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
> > > >>> I meant a mass "sed -e 's/TRUE/true/g'  -e 's/FALSE/false/g'" run
> > > >>> so all the ~200 occurrences of both "TRUE" and "FALSE" get
> > > >>> converted so the whole source tree is consistent.
> > > 
> > > >> I would be in favor of that.
> > > 
> > > > I have implemented this with the patch at:
> > > > http://momjian.us/expire/true_diff.txt
> > > 
> > > Does this really do anything for us that will justify the extra
> > > back-patching pain it will cause?  I don't see that it's improving
> > > code readability any.
> > 
> > I think it is more of a consistency issue.  There were multiple people
> > who wanted this change.  Of course, some of those people don't backport
> > stuff.
> 
> I guess you could argue this particular case without end, but I think we
> should be open to these kinds of changes.  They will make future code
> easier to deal with and confuse new developers less (when to use which,
> do they mean different things, etc.).
> 
> If back-patching really becomes a problem, we might want to look a
> little deeper into git options.  I think the Linux kernel people do
> these kinds of cleanups more often, so there is probably some better
> support for it.

So what do we want to do with this?  I am a little concerned that we are
sacrificing code clarity for backpatching ease, but I don't do as much
backpatching as Tom.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \set vs \copy - bug or expected behaviour?

2012-08-16 Thread Bruce Momjian
On Fri, Oct 21, 2011 at 05:31:41PM -0400, Robert Haas wrote:
> On Fri, Oct 21, 2011 at 7:24 AM, Richard Huxton  wrote:
> > It looks like \copy is just passing the text of the query unadjusted to
> > "COPY". I get a syntax error on ":x" with the \copy below on both 9.0 and
> > 9.1
> >
> > === test script ===
> > \set x '''HELLO'''
> > -- Works
> > \echo :x
> > -- Works
> > \o '/tmp/test1.txt'
> > COPY (SELECT :x) TO STDOUT;
> > -- Doesn't work
> > \copy (SELECT :x) TO '/tmp/test2.txt'
> > === end script ===
> 
> I'm not sure whether that's a bug per se, but I can see where a
> behavior change might be an improvement.

I did some research on this and learned a little more about flex rules.

Turns out we can allow variable substitution in psql whole-line
commands, like \copy and \!, by sharing the variable expansion flex
rules with the code that does argument processing.  

What we can't easily do is allow quotes to prevent variable
substitution in these whole-line commands, because processing the
quotes would remove them.

Here are some examples;  \copy and \! behave the same:

test=> \set x abc
test=> \echo :x
abc
test=> \echo ":x"
--> ":x"
test=> \! echo :x
abc
test=> \! echo ":x"
--> abc

Notice the last line has expanded :x even though it is in quotes.

So, what do we want?  The attached patch is pretty short.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
new file mode 100644
index 1208c8f..3732dc5
*** a/src/bin/psql/psqlscan.l
--- b/src/bin/psql/psqlscan.l
*** other			.
*** 934,949 
  
  }
  
! {
  	/*
  	 * Default processing of text in a slash command's argument.
  	 *
  	 * Note: unquoted_option_chars counts the number of characters at the
  	 * end of the argument that were not subject to any form of quoting.
  	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
  	 */
  
! {space}|"\\"	{
  	/*
  	 * Unquoted space is end of arg; do not eat.  Likewise
  	 * backslash is end of command or next command, do not eat
--- 934,956 
  
  }
  
! 
  	/*
  	 * Default processing of text in a slash command's argument.
+ 	 * It shares token actions with xslasharg and xslashwholeline.
  	 *
  	 * Note: unquoted_option_chars counts the number of characters at the
  	 * end of the argument that were not subject to any form of quoting.
  	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
  	 */
  
! {space}+	{
! 	/* process entire line, but suppress leading whitespace */
! 	if (output_buf->len > 0)
! 		ECHO;
! }
! 
! {space}|"\\"	{
  	/*
  	 * Unquoted space is end of arg; do not eat.  Likewise
  	 * backslash is end of command or next command, do not eat
*** other			.
*** 957,982 
  	return LEXRES_OK;
  }
  
! {quote}			{
! 	*option_quote = '\'';
! 	unquoted_option_chars = 0;
! 	BEGIN(xslashquote);
! }
! 
! "`"{
  	backtick_start_offset = output_buf->len;
  	*option_quote = '`';
  	unquoted_option_chars = 0;
  	BEGIN(xslashbackquote);
  }
  
! {dquote}		{
  	ECHO;
  	*option_quote = '"';
  	unquoted_option_chars = 0;
  	BEGIN(xslashdquote);
  }
  
  :{variable_char}+	{
  	/* Possible psql variable substitution */
  	if (option_type == OT_NO_EVAL)
--- 964,1005 
  	return LEXRES_OK;
  }
  
! "`"		{
! /* Only in xslasharg, so backticks are potentially passed to the shell */
  	backtick_start_offset = output_buf->len;
  	*option_quote = '`';
  	unquoted_option_chars = 0;
  	BEGIN(xslashbackquote);
  }
  
! {quote}			{
! 	*option_quote = '\'';
! 	unquoted_option_chars = 0;
! 	BEGIN(xslashquote);
! }
! 
! {dquote}		{
  	ECHO;
  	*option_quote = '"';
  	unquoted_option_chars = 0;
  	BEGIN(xslashdquote);
  }
  
+ {other}	{
+ 	unquoted_option_chars++;
+ 	ECHO;
+ }
+ 
+ {other}	{ ECHO; }
+ 
+ 	/*
+ 	 *	This code allows variable processing in slasharg and wholeline
+ 	 *	modes.  wholeline does not allow quoting to prevent variable
+ 	 *	substitution because quote detection would remove the quotes.
+ 	 */
+  
+ {
+ 
  :{variable_char}+	{
  	/* Possible psql variable substitution */
  	if (option_type == OT_NO_EVAL)
*** other			.
*** 1044,1054 
  	ECHO;
  }
  
- {other}			{
- 	unquoted_option_chars++;
- 	ECHO;
- }
- 
  }
  
  {
--- 1067,1072 
*** other			.
*** 1115,1133 
  
  }
  
- {
- 	/* copy everything until end of input line */
- 	/* but suppress leading whitespace */
- 
- {space}+		{
- 	if (output_buf->len > 0)
- 		ECHO;
- }
- 
- {other}			{ ECHO; }
- 
- }
- 
  {
  	/* at end of command, eat a double backslash, bu

Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Fabrízio de Royes Mello
2012/8/16 David E. Wheeler 

> On Aug 16, 2012, at 10:36 AM, Fabrízio de Royes Mello wrote:
>
> > The attached patch implement this feature:
> >
> > CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ]
> [ schema_element [ ... ] ]
> > CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element
> [ ... ] ]
> >
> > Now, PostgreSQL doesn't throw an error if we use "IF NOT EXISTS" in a
> "CREATE SCHEMA" statement.
> >
> > So, I don't know the next steps...
>
> Awesome, thanks! Please add it to the next CommitFest:
>
>   https://commitfest.postgresql.org/action/commitfest_view?id=15
>
>
Patch added to CommitFest:

https://commitfest.postgresql.org/action/patch_view?id=907

Thanks,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread David E. Wheeler
On Aug 16, 2012, at 10:36 AM, Fabrízio de Royes Mello wrote:

> The attached patch implements this feature:
> 
> CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ] [ 
> schema_element [ ... ] ]
> CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element [ 
> ... ] ]
> 
> Now, PostgreSQL doesn't throw an error if we use "IF NOT EXISTS" in a 
> "CREATE SCHEMA" statement.
> 
> So, I don't know the next steps...

Awesome, thanks! Please add it to the next CommitFest:

  https://commitfest.postgresql.org/action/commitfest_view?id=15

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Andrew Dunstan


On 08/16/2012 01:36 PM, Fabrízio de Royes Mello wrote:



2012/8/15 David E. Wheeler >


On Aug 15, 2012, at 11:31 AM, Fabrízio de Royes Mello wrote:

>> Is there any reason not to add $subject? Would it be difficult?
>
> Looking at the source code I think this feature isn't hard to
implement... I'm writing a little patch to do that and I'll send
soon...

Cool, thanks!


The attached patch implements this feature:

CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name 
] [ schema_element [ ... ] ]
CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ 
schema_element [ ... ] ]


Now, PostgreSQL doesn't throw an error if we use "IF NOT EXISTS" in a 
"CREATE SCHEMA" statement.


So, I don't know the next steps...



Please see 



cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS

2012-08-16 Thread Fabrízio de Royes Mello
2012/8/15 David E. Wheeler 

> On Aug 15, 2012, at 11:31 AM, Fabrízio de Royes Mello wrote:
>
> >> Is there any reason not to add $subject? Would it be difficult?
> >
> > Looking at the source code I think this feature isn't hard to
> implement... I'm writing a little patch to do that and I'll send soon...
>
> Cool, thanks!
>
>
The attached patch implements this feature:

CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION user_name ] [
schema_element [ ... ] ]
CREATE SCHEMA [ IF NOT EXISTS ] AUTHORIZATION user_name [ schema_element [
... ] ]

Now, PostgreSQL doesn't throw an error if we use "IF NOT EXISTS" in a
"CREATE SCHEMA" statement.
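
A minimal usage sketch (the schema name is made up; assuming the patch follows
the other IF NOT EXISTS forms and skips the statement instead of raising an error):

CREATE SCHEMA IF NOT EXISTS audit;
CREATE SCHEMA IF NOT EXISTS audit;  -- second call: schema already exists, no error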

So, I don't know the next steps...

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


create_schema_if_not_exists.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of Thu Aug 16 11:24:55 -0400 2012:
> On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
> > On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane  wrote:
> > > Alvaro Herrera  writes:
> > >> I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is comparing 
> > >> Xmax as a TransactionId without verifying whether it is a multixact or 
> > >> not.  Since they advance separately, this could lead to bogus answers.  
> > >> This probably needs to be fixed.  I didn't look into past releases to 
> > >> see if there's a live released bug here or not.
> > >
> > >> I think the fix is simply to ignore the Xmax if the HEAP_XMAX_IS_MULTI 
> > >> bit is set.
> > >
> > >> Additionally I think it should check HEAP_XMAX_INVALID before reading 
> > >> the Xmax at all.
> > >
> > > If it's failing to even check XMAX_INVALID, surely it's completely
> > > broken?  Perhaps it assumes its caller has checked all this?
> > 
> > HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
> > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
> > when HEAP_XMAX_IS_MULTI is not set.
> > 
> > I'll add an assert to check this and a comment to explain.
> 
> Was this completed?

As far as I recall, there are changes related to this in my fklocks
patch.  I am hoping to have some review happen on it during the upcoming
commitfest (which presumably means I need to do a merge to newer
sources.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The pgrminclude problem

2012-08-16 Thread Peter Geoghegan
On 16 August 2012 16:56, Bruce Momjian  wrote:
> Good to know. We only use pgrminclude every five years or so, and Tom
> isn't even keen on that.

Yeah. Even if this could be made to work well, we'd still have to do
something like get an absolute consensus from all build farm animals,
if we expected to have an absolutely trustworthy list. I don't think
pgrminclude is a bad idea. I just think that it should only be used to
guide the efforts of a human to remove superfluous #includes, which is
how it is used anyway.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: In docs, change a few cases of "not important" to "unimportant".

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 10:38:09AM -0500, Kevin Grittner wrote:
> Heikki Linnakangas  wrote:
> > On 16.08.2012 17:49, Bruce Momjian wrote:
> >> Heikki Linnakangas wrote:
> >>> On 16.08.2012 17:36, Bruce Momjian wrote:
>  In docs, change a few cases of "not important" to
>  "unimportant".
> >>>
> >>> FWIW, I don't think these changes were an improvement. I find
> >>> "not important" more readable in those sentences.
>  
> +1
>  
> >> Really?  All of them?
> > 
> > Yes.. Perhaps "not important" just feels better to me in general
> > than "unimportant".
>  
> Yeah, it's probably subjective, but for me "unimportant" has
> negative, dismissive overtones, while "not important" does not. 
> Sort of like the way "unpleasant" has a different meaning than "not
> pleasant".

OK, thanks for the feedback.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2012-08-16 Thread Kevin Grittner
Alvaro Herrera  wrote:
 
> I think that a NOT NULL constraint attached to a column with a
> composite type is equivalent to a CHECK (col IS DISTINCT FROM
> NULL); at least they seem to behave identically.  Is that what you
> would expect?
 
I had not thought about that, but now that you point it out I think
that interpretation makes more sense than any other.  In a quick
test they behaved identically for me.
 
> This seems a bit complicated to handle with the way I'm doing
> things today; at parse analysis time, when my current code is
> creating the check constraint, we don't know anything about the
> type of the column IIRC.  Maybe I will have to delay creating the
> constraint until execution.
 
Why?  CHECK (col IS DISTINCT FROM NULL) works correctly for *any*
type, doesn't it?
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of Thu Aug 16 11:44:48 -0400 2012:
> On Thu, Aug 16, 2012 at 11:38:14AM -0400, Alvaro Herrera wrote:
> > Excerpts from Bruce Momjian's message of Thu Aug 16 11:24:55 -0400 2012:
> > > On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
> > > > On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane  wrote:
> > > > > Alvaro Herrera  writes:
> > > > >> I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is 
> > > > >> comparing Xmax as a TransactionId without verifying whether it is a 
> > > > >> multixact or not.  Since they advance separately, this could lead to 
> > > > >> bogus answers.  This probably needs to be fixed.  I didn't look into 
> > > > >> past releases to see if there's a live released bug here or not.
> > > > >
> > > > >> I think the fix is simply to ignore the Xmax if the 
> > > > >> HEAP_XMAX_IS_MULTI bit is set.
> > > > >
> > > > >> Additionally I think it should check HEAP_XMAX_INVALID before 
> > > > >> reading the Xmax at all.
> > > > >
> > > > > If it's failing to even check XMAX_INVALID, surely it's completely
> > > > > broken?  Perhaps it assumes its caller has checked all this?
> > > > 
> > > > HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
> > > > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
> > > > when HEAP_XMAX_IS_MULTI is not set.
> > > > 
> > > > I'll add an assert to check this and a comment to explain.
> > > 
> > > Was this completed?
> > 
> > As far as I recall, there are changes related to this in my fklocks
> > patch.  I am hoping to have some review happen on it during the upcoming
> > commitfest (which presumably means I need to do a merge to newer
> > sources.)
> 
> I was asking about adding the assert check --- does that need to wait
> too?

I don't think it's worth fussing about.  Also I don't need any more
merge conflicts than I already have.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The pgrminclude problem

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 04:48:34PM +0100, Peter Geoghegan wrote:
> I found a tool that Google authored that attempts to solve the same
> problems as pgrminclude does in our codebase.  It's called "include
> what you use", and is based on Clang. The project is hosted here:
> 
> http://code.google.com/p/include-what-you-use/
> 
> I'm not suggesting that we should start using this tool instead of
> pgrminclude, because it has enough caveats of its own, and is mostly
> written with Google's C++ codebase in mind. However, it's worth being
> aware of. There is a useful analysis of the general problem in the
> README - "Why IWYU is Difficult" (you can probably just skip the
> extensive analysis of C++ templates as they relate to the problem
> there).
> 
> The tool is authored by Craig Silverstein, Google's director of
> technology. If he believes that IWYU is a difficult problem, well, it
> probably is.

Good to know. We only use pgrminclude every five years or so, and Tom
isn't even keen on that.

I added the URL to our src/tools/pginclude/README.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] The pgrminclude problem

2012-08-16 Thread Peter Geoghegan
I found a tool that Google authored that attempts to solve the same
problems as pgrminclude does in our codebase.  It's called "include
what you use", and is based on Clang. The project is hosted here:

http://code.google.com/p/include-what-you-use/

I'm not suggesting that we should start using this tool instead of
pgrminclude, because it has enough caveats of its own, and is mostly
written with Google's C++ codebase in mind. However, it's worth being
aware of. There is a useful analysis of the general problem in the
README - "Why IWYU is Difficult" (you can probably just skip the
extensive analysis of C++ templates as they relate to the problem
there).

The tool is authored by Craig Silverstein, Google's director of
technology. If he believes that IWYU is a difficult problem, well, it
probably is.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 11:38:14AM -0400, Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of Thu Aug 16 11:24:55 -0400 2012:
> > On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
> > > On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane  wrote:
> > > > Alvaro Herrera  writes:
> > > >> I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is 
> > > >> comparing Xmax as a TransactionId without verifying whether it is a 
> > > >> multixact or not.  Since they advance separately, this could lead to 
> > > >> bogus answers.  This probably needs to be fixed.  I didn't look into 
> > > >> past releases to see if there's a live released bug here or not.
> > > >
> > > >> I think the fix is simply to ignore the Xmax if the HEAP_XMAX_IS_MULTI 
> > > >> bit is set.
> > > >
> > > >> Additionally I think it should check HEAP_XMAX_INVALID before reading 
> > > >> the Xmax at all.
> > > >
> > > > If it's failing to even check XMAX_INVALID, surely it's completely
> > > > broken?  Perhaps it assumes its caller has checked all this?
> > > 
> > > HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
> > > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
> > > when HEAP_XMAX_IS_MULTI is not set.
> > > 
> > > I'll add an assert to check this and a comment to explain.
> > 
> > Was this completed?
> 
> As far as I recall, there are changes related to this in my fklocks
> patch.  I am hoping to have some review happen on it during the upcoming
> commitfest (which presumably means I need to do a merge to newer
> sources.)

I was asking about adding the assert check --- does that need to wait
too?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: In docs, change a few cases of "not important" to "unimportant".

2012-08-16 Thread Kevin Grittner
Heikki Linnakangas  wrote:
> On 16.08.2012 17:49, Bruce Momjian wrote:
>> Heikki Linnakangas wrote:
>>> On 16.08.2012 17:36, Bruce Momjian wrote:
 In docs, change a few cases of "not important" to
 "unimportant".
>>>
>>> FWIW, I don't think these changes were an improvement. I find
>>> "not important" more readable in those sentences.
 
+1
 
>> Really?  All of them?
> 
> Yes.. Perhaps "not important" just feels better to me in general
> than "unimportant".
 
Yeah, it's probably subjective, but for me "unimportant" has
negative, dismissive overtones, while "not important" does not. 
Sort of like the way "unpleasant" has a different meaning than "not
pleasant".
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts

2012-08-16 Thread Bruce Momjian
On Mon, Oct 17, 2011 at 08:38:18PM +0100, Simon Riggs wrote:
> On Mon, Oct 17, 2011 at 8:03 PM, Tom Lane  wrote:
> > Alvaro Herrera  writes:
> >> I just noticed that HeapTupleHeaderAdvanceLatestRemovedXid is comparing 
> >> Xmax as a TransactionId without verifying whether it is a multixact or 
> >> not.  Since they advance separately, this could lead to bogus answers.  
> >> This probably needs to be fixed.  I didn't look into past releases to see 
> >> if there's a live released bug here or not.
> >
> >> I think the fix is simply to ignore the Xmax if the HEAP_XMAX_IS_MULTI bit 
> >> is set.
> >
> >> Additionally I think it should check HEAP_XMAX_INVALID before reading the 
> >> Xmax at all.
> >
> > If it's failing to even check XMAX_INVALID, surely it's completely
> > broken?  Perhaps it assumes its caller has checked all this?
> 
> HeapTupleHeaderAdvanceLatestRemovedXid() is only ever called when
> HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, which only happens
> when HEAP_XMAX_IS_MULTI is not set.
> 
> I'll add an assert to check this and a comment to explain.

Was this completed?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Underspecified window queries in regression tests

2012-08-16 Thread Bruce Momjian

I have used your notes below to rewrite the Window function SQL manual
section.  As you said, it was very hard to read.  I now understand it
better, having restructured it, and I hope others do too.
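
As a quick illustration of the ROWS-vs-RANGE determinism point made in the
quoted notes below (table and data invented for the example):

CREATE TABLE w (x int);
INSERT INTO w VALUES (1), (2), (2), (3);

-- RANGE mode: each row's frame includes all of its ORDER BY peers, so both
-- x = 2 rows get the same running sum regardless of their physical order.
SELECT x, sum(x) OVER (ORDER BY x
                       RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM w;

-- ROWS mode: the frame stops at the current row itself, so the two x = 2 rows
-- get different sums, and which row gets which is not determined.
SELECT x, sum(x) OVER (ORDER BY x
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM w;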

After waiting 30 minutes for our developer doc build to refresh, I am
giving up and posting my own URL for the doc changes:

http://momjian.us/tmp/pgsql/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS

Perhaps I need to go back to having my own doc build.

---

On Mon, Oct 17, 2011 at 11:48:38AM +0200, Florian Pflug wrote:
> On Oct17, 2011, at 01:09 , Tom Lane wrote:
> > Florian Pflug  writes:
> >> ... reading those parts again, I realize that it says "When ORDER BY is 
> >> omitted
> >> the *default* frame consists ... ", and that the second quote is followed
> >> by a footnote which says
> > 
> >>  There are options to define the window frame in other ways, but this 
> >> tutorial
> >>  does not cover them. See Section 4.2.8 for details. [3.5, Window 
> >> Functions]
> > 
> >> So it was just me being thick. Sorry for the noise.
> > 
> > Hmm.  Maybe the use of a  there is too subtle, and we should
> > instead have that text in-line (probably in parentheses)?  Or we could
> > use a , but that's probably too much emphasis.
> 
> Inline and in parentheses sounds fine.
> 
> In addition, I think we should reword the explanation in 4.2.8 (The SQL 
> Language
> / SQL Syntax / Value Expressions / Window Functions). Instead of that rather
> long (and IMHO hard to read) paragraph about possible frame clauses and their
> behaviour in the presence or absence of an ORDER BY clause, we should go with
> a more algorithmic explanation I think.
> 
> Something along these lines maybe:
> 
> --
> .) PARTITION BY splits the rows into disjoint partitions. All further 
> processing
>happens only inside a single partition
> 
> .) In RANGE mode, ORDER BY then splits each partition into an ordered list of
>sub-partitions, each containing rows which the ORDER BY considers to be
>equivalent.
> 
> .) In ROWS mode, OTOH, each sub-partition contains only a single row. Thus, if
>there are rows which are considered to be equivalent by the ORDER BY, the
>ordering of the sub-partition isn't fully determined.
> 
> .) Each row's frame then consists of some consecutive range of sub-partitions.
> 
> .) In RANGE mode, that consecutive range can only start at either the first
>sub-partition or the current row's sub-partition, and can only end at 
> either
>    the current row's sub-partition or the last sub-partition.
> 
> .) In ROWS mode, the consecutive range may additionally start  sub-partitions
>    (or rows, it's the same thing here) before the current row, and may 
> additionally
>    end  sub-partitions/rows after the current row.
> 
> From that, it follows that even with an underspecified sort order, the 
> contents of
> each frame are still fully determined in RANGE mode. The ordering of rows 
> within
> a frame is not determined, though. So overall, in RANGE mode, a query's 
> result is
> only non-deterministic if the window function is sensitive to the ordering of 
> rows
> within a frame.
> 
> In ROWS mode, OTOH, the contents of each frame themselves are not fully 
> determined,
> so even an ordering agnostic window function may produce non-deterministic 
> results.
> --
> 
> If you think that something along these lines would be an improvement, I can 
> try
> to come up with a patch.
> 
> best regards,
> Florian Pflug
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Clean up the #include mess a little.

2012-08-16 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of Wed Aug 15 18:30:40 -0400 2012:
> 
> On Wed, Sep  7, 2011 at 01:20:17AM +0300, Ants Aasma wrote:
> > On Tue, Sep 6, 2011 at 10:18 PM, Alvaro Herrera
> >  wrote:
> > > I wonder what happens if files in the same subdir are grouped in a
> > > subgraph.  Is that possible?
> > 
> > Possible, and done. Also added the possibility to add .c files to the graph,
> > coloring by subdir and the possibility to exclude nodes from the graph. I didn't 
> > yet
> > bother to clean up the code - to avoid eye damage, don't look at the source.
> > 
> > Bad news is that it doesn't significantly help readability for the all nodes
> > case. See all_with_subgraphs.svgz.  It does help for other cases.
> > For example parsenodes.h.svgz has the result for
> > render_includes.py --select='nodes/parsenodes.h+*-*' --subgraphs
> > and execnodes.h.svgz for
> > --subgraphs --select='nodes/execnodes.h+*-*'
> 
> Should we add this script and instructions to src/tools/pginclude?

Probably not, but maybe the developer FAQ in the wiki?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UNION ALL with WHERE clause does not use Merge Append

2012-08-16 Thread Tom Lane
Marti Raudsepp  writes:
> Is this just a planner shortcoming or a bug? Or is there some
> justification for this behavior?

Per the comment in is_safe_append_member():

 * It's only safe to pull up the child if its jointree contains exactly
 * one RTE, else the AppendRelInfo data structure breaks. The one base RTE
 * could be buried in several levels of FromExpr, however.
 *
 * Also, the child can't have any WHERE quals because there's no place to
 * put them in an appendrel.  (This is a bit annoying...)

This will probably get fixed someday, but I wouldn't recommend holding
your breath for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2012-08-16 Thread Alvaro Herrera
Excerpts from Kevin Grittner's message of Thu Aug 02 10:48:02 -0400 2012:
> 
> "Kevin Grittner"  wrote:
>  
> > Don't forget the peculiarities of columns with record types. 
>  
> I forgot to include the type creation in the example:
>  
> test=# create type a as (a1 int, a2 int);
> CREATE TYPE

Thanks for the example.  After playing with this, I think that a NOT
NULL constraint attached to a column with a composite type is equivalent
to a CHECK (col IS DISTINCT FROM NULL); at least they seem to behave
identically.  Is that what you would expect?
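
For reference, here is a small script to compare the two forms side by side,
reusing the type from your example (table names are arbitrary). As I understand
the behavior we are discussing, a bare NULL is rejected by both, while
ROW(NULL, NULL) is accepted by both, which is what distinguishes them from
CHECK (col IS NOT NULL):

CREATE TYPE a AS (a1 int, a2 int);
CREATE TABLE t1 (col a NOT NULL);
CREATE TABLE t2 (col a CHECK (col IS DISTINCT FROM NULL));

INSERT INTO t1 VALUES (NULL);             -- rejected
INSERT INTO t2 VALUES (NULL);             -- rejected
INSERT INTO t1 VALUES (ROW(NULL, NULL));  -- accepted
INSERT INTO t2 VALUES (ROW(NULL, NULL));  -- accepted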

This seems a bit complicated to handle with the way I'm doing things
today; at parse analysis time, when my current code is creating the
check constraint, we don't know anything about the type of the column
IIRC.  Maybe I will have to delay creating the constraint until
execution.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-08-16 Thread Heikki Linnakangas

On 15.08.2012 11:34, Alexander Korotkov wrote:

On Wed, Aug 15, 2012 at 12:14 PM, Heikki Linnakangas<
heikki.linnakan...@enterprisedb.com>  wrote:


Histogram of upper bounds would be both more

accurate and natural for some operators. However, it requires collecting
additional statistics while AFAICS it doesn't liberate us from having
histogram of range lengths.


Hmm, if we collected a histogram of lower bounds and a histogram of upper
bounds, that would be roughly the same amount of data as for the "standard"
histogram with both bounds in the same histogram.


OK, we have to decide if we need the "standard" histogram. In some cases it can
be used for more accurate estimation of the < and > operators.
But I think it is not so important. So, we can replace the "standard" histogram
with histograms of lower and upper bounds?


Yeah, I think that makes more sense. The lower bound histogram is still 
useful for < and > operators, just not as accurate if there are lots of 
values with the same lower bound but different upper bound.
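
For what it's worth, that follows from range comparison looking at the lower
bounds first and only falling back to the upper bounds on ties, e.g.:

SELECT int4range(1,10) < int4range(2,3);   -- true, decided by the lower bounds alone
SELECT int4range(1,10) < int4range(1,20);  -- true, equal lower bounds, decided by the upper bounds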


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-08-16 Thread Heikki Linnakangas

On 09.08.2012 18:42, Alexander Korotkov wrote:

In this revision of the patch I tried to handle conditions more generally using
variables minLower, maxLower, minUpper, maxUpper, inclusive and
strictEmpty. However some strategies still contain additional logic.


Thanks, that clarified the code tremendously. The comments I added about 
the geometrical interpretations of the operations earlier seem 
unnecessary now, so removed those.



What is our conclusion about saving previous choice for RANGESTRAT_ADJACENT
strategy?


I think we're going to do what you did in the patch. A more generic 
mechanism for holding private state across consistent calls would be 
nice, but it's not that ugly the way you wrote it.


I committed the patch now, but left out the support for adjacent for 
now. Not because there was necessarily anything wrong with that, but 
because I have limited time for reviewing, and the rest of the patch 
looks ready for commit now. I reworded the comments quite a lot, you 
might want to proofread those to double-check that they're still 
correct. I'll take a look at the adjacent-support next, as a separate patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: allow to get a number of processed rows by COPY statement

2012-08-16 Thread Pavel Stehule
Hello

here is updated patch

postgres=# copy omega to '/tmp/xxx';
COPY 60
postgres=# do $$ declare r int;
begin
   copy omega from '/tmp/xxx'; get diagnostics r = row_count; raise
notice '>>> %', r;
end;
$$ language plpgsql;
NOTICE:  >>> 60
DO

Regards

Pavel

2012/8/16 Bruce Momjian :
>
> What ever happened to this patch?  I don't see it on any of the
> commit-fests, though someone asked for it to be added:
>
> http://archives.postgresql.org/pgsql-hackers/2011-10/msg00381.php
>
> ---
>
> On Tue, Oct  4, 2011 at 12:22:19PM +0200, Pavel Stehule wrote:
>> Hello
>>
>> It is not possible to get the number of processed rows when COPY is
>> evaluated via SPI. A client can use the command tag, but SPI doesn't use a tag.
>>
>> I propose a small change to ProcessUtility to return the number of processed rows.
>>
>> Note: I found a small inconsistency between the SPI and Utility interfaces.
>> SPI still uses a 4-byte unsigned int for storing the number of processed
>> rows. Utility uses an 8-byte unsigned int.
>>
>> Motivation:
>>
>>  postgres=# \sf fx
>> CREATE OR REPLACE FUNCTION public.fx(tablename text, filename text)
>>  RETURNS integer
>>  LANGUAGE plpgsql
>> AS $function$
>> declare r int;
>> begin
>>   execute format('COPY %s FROM %s', quote_ident(tablename),
>> quote_literal(filename));
>>   get diagnostics r = row_count;
>>   return r;
>> end;
>> $function$
>>
>> Regards
>>
>> Pavel Stehule
>
>> diff --git a/src/backend/executor/functions.c 
>> b/src/backend/executor/functions.c
>> index 398bc40..a7c2b8f 100644
>> --- a/src/backend/executor/functions.c
>> +++ b/src/backend/executor/functions.c
>> @@ -600,6 +600,7 @@ postquel_getnext(execution_state *es, 
>> SQLFunctionCachePtr fcache)
>>  es->qd->params,
>>  false,   /* not top level */
>>  es->qd->dest,
>> +NULL,
>>  NULL);
>>   result = true;  /* never stops early */
>>   }
>> diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
>> index 688279c..21cabcc 100644
>> --- a/src/backend/executor/spi.c
>> +++ b/src/backend/executor/spi.c
>> @@ -1838,6 +1838,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo 
>> paramLI,
>>   {
>>   Node   *stmt = (Node *) lfirst(lc2);
>>   boolcanSetTag;
>> + boolisCopyStmt = false;
>>   DestReceiver *dest;
>>
>>   _SPI_current->processed = 0;
>> @@ -1857,6 +1858,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo 
>> paramLI,
>>   {
>>   CopyStmt   *cstmt = (CopyStmt *) stmt;
>>
>> + isCopyStmt = true;
>>   if (cstmt->filename == NULL)
>>   {
>>   my_res = SPI_ERROR_COPY;
>> @@ -1911,16 +1913,23 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo 
>> paramLI,
>>   }
>>   else
>>   {
>> + uint32 processed;
>> +
>>   ProcessUtility(stmt,
>>  
>> plansource->query_string,
>>  paramLI,
>>  false,   /* not 
>> top level */
>>  dest,
>> -NULL);
>> +NULL,
>> +&processed);
>>   /* Update "processed" if stmt returned tuples 
>> */
>> +
>>   if (_SPI_current->tuptable)
>>   _SPI_current->processed = 
>> _SPI_current->tuptable->alloced -
>>   _SPI_current->tuptable->free;
>> + else if (canSetTag && isCopyStmt)
>> + _SPI_current->processed = processed;
>> +
>>   res = SPI_OK_UTILITY;
>>   }
>>
>> diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
>> index 466727b..1a861ee 100644
>> --- a/src/backend/tcop/pquery.c
>> +++ b/src/backend/tcop/pquery.c
>> @@ -1184,7 +1184,8 @@ PortalRunUtility(Portal portal, Node *utilityStmt, 
>> bool isTopLevel,
>>  portal->portalParams,
>>  isTopLevel,
>>  dest,
>> -comple

[HACKERS] UNION ALL with WHERE clause does not use Merge Append

2012-08-16 Thread Marti Raudsepp
Hi list,

I have a table with (start_time, end_time) and I'd want to tally up
the number of concurrent connections at any point in time. The "Merge
Append" plan node introduced in 9.1 would be perfect for this purpose.
It seems to work out fine for the most trivial case, but as soon as I
add a WHERE clause to any of the UNION subqueries, this stops
working. Tested on 9.1.4 and 9.2beta3.

Here's a simplified test case:

create table foo as select generate_series(1,100) i;
create index on foo(i);
vacuum analyze foo;

This works as expected:
EXPLAIN ANALYZE
  select i from foo
UNION ALL
  select i from foo
ORDER BY 1 LIMIT 100;

 Limit  (cost=0.01..3.31 rows=100 width=4) (actual time=0.028..0.078
rows=100 loops=1)
   ->  Result  (cost=0.01..65981.61 rows=200 width=4) (actual
time=0.026..0.064 rows=100 loops=1)
 ->  Merge Append  (cost=0.01..65981.61 rows=200 width=4)
(actual time=0.026..0.053 rows=100 loops=1)
   Sort Key: public.foo.i
   ->  Index Only Scan using foo_i_idx on foo
(cost=0.00..20490.80 rows=100 width=4) (actual time=0.017..0.021
rows=51 loops=1)
 Heap Fetches: 0
   ->  Index Only Scan using foo_i_idx on foo
(cost=0.00..20490.80 rows=100 width=4) (actual time=0.007..0.012
rows=50 loops=1)
 Heap Fetches: 0
 Total runtime: 0.106 ms


But once I add even a basic WHERE clause, suddenly it decides that
sorting is the only way:
EXPLAIN ANALYZE
  select i from foo where i is not null
UNION ALL
  select i from foo where i is not null
ORDER BY 1 LIMIT 100;

 Limit  (cost=127250.56..127250.81 rows=100 width=4) (actual
time=1070.799..1070.812 rows=100 loops=1)
   ->  Sort  (cost=127250.56..132250.56 rows=200 width=4) (actual
time=1070.798..1070.804 rows=100 loops=1)
 Sort Key: public.foo.i
 Sort Method: top-N heapsort  Memory: 29kB
 ->  Result  (cost=0.00..50812.00 rows=200 width=4)
(actual time=0.009..786.806 rows=200 loops=1)
   ->  Append  (cost=0.00..50812.00 rows=200 width=4)
(actual time=0.007..512.201 rows=200 loops=1)
 ->  Seq Scan on foo  (cost=0.00..15406.00
rows=100 width=4) (actual time=0.007..144.872 rows=100
loops=1)
   Filter: (i IS NOT NULL)
 ->  Seq Scan on foo  (cost=0.00..15406.00
rows=100 width=4) (actual time=0.003..139.196 rows=100
loops=1)
   Filter: (i IS NOT NULL)
 Total runtime: 1070.847 ms


Works again when I stuff it in a subquery and put WHERE above the
UNION. But this loses flexibility -- I can't filter the subqueries
with different clauses any more:

EXPLAIN ANALYZE
select * from (
select i from foo
  UNION ALL
select i from foo
) subq where i is not null
ORDER BY 1 LIMIT 100;

 Limit  (cost=0.01..3.56 rows=100 width=4) (actual time=0.033..0.088
rows=100 loops=1)
   ->  Result  (cost=0.01..70981.61 rows=200 width=4) (actual
time=0.032..0.071 rows=100 loops=1)
 ->  Merge Append  (cost=0.01..70981.61 rows=200 width=4)
(actual time=0.031..0.059 rows=100 loops=1)
   Sort Key: public.foo.i
   ->  Index Only Scan using foo_i_idx on foo
(cost=0.00..22990.80 rows=100 width=4) (actual time=0.020..0.025
rows=51 loops=1)
 Index Cond: (i IS NOT NULL)
 Heap Fetches: 0
   ->  Index Only Scan using foo_i_idx on foo
(cost=0.00..22990.80 rows=100 width=4) (actual time=0.010..0.014
rows=50 loops=1)
 Index Cond: (i IS NOT NULL)
 Heap Fetches: 0
 Total runtime: 0.115 ms


Is this just a planner shortcoming or a bug? Or is there some
justification for this behavior?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] feature request: auto savepoint for interactive psql when in transaction.

2012-08-16 Thread Daniel Farina
On Mon, Nov 14, 2011 at 3:05 PM, Ross Reedstrom  wrote:
> On Mon, Nov 14, 2011 at 02:45:04PM -0800, Will Leinweber wrote:
>> My coworker Dan suggested that some people copy and paste scripts. However
>> I feel that that is an orthogonal problem and if there is a very high rate
>> of input psql should detect that and turn interactive off. And I
>> still strongly feel that on_error_rollback=interactive should be the
>> default.
>
> Hmm, I think that falls under the "don't do that, then" use case. I've been
> known to c&p the occasional script - I guess the concern here would be not
> seeing failed steps that scrolled off the terminal. (I set my scrollback to
> basically infinity and actually use it, but then I'm strange that way :-) )

I do this and have done this all the time.  Because emacs.  On the
other hand, I only really do it in line-buffered modes.  I also feel
there is something of a development/production parity that is broken
by this, but then again, so are backslash commands interpreted by
psql, and that has never proven to be a practical problem.  That I know of.

I wouldn't let my particular use case (M-x shell) get in the way of
changing the default if that was the consensus because I think this
would help a lot more people than hurt.  In the discoverability
department, one can not-hack the server error message by having psql
emit its own hint when it receives an error while in a transaction
block (vs auto-commit).  This is knowable via the ReadyForQuery
message, which can tell you "idle in transaction".
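
For anyone who wants the proposed behavior today, it is already available as a
psql variable, e.g. in ~/.psqlrc:

\set ON_ERROR_ROLLBACK interactive

With that set, psql wraps each interactively entered statement inside a
transaction block in an implicit savepoint and rolls back to it on error, so a
typo no longer forces a full ROLLBACK; "on" does the same even for scripted
input, and "off" (the default) disables it. The open question in this thread is
only whether "interactive" should become the default.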

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers