Re: [HACKERS] recent ALTER whatever .. SET SCHEMA refactoring

2013-01-08 Thread Kohei KaiGai
2013/1/7 Robert Haas robertmh...@gmail.com:
 On Mon, Jan 7, 2013 at 2:14 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
 Kohei KaiGai escribió:

 Functions and collations are the candidates for this special-case handling;
 there are just two kinds of object.

 Another idea is to add a function pointer as an argument of
 AlterNamespace_internal for (upcoming) object classes that take
 special handling to detect name collisions.
 My personal preference is the latter one, rather than hardwired
 special-case handling.
 However, it may be too elaborate to handle just two exceptions.

 I think this idea is fine.  Pass a function pointer which is only
 not-NULL for the two exceptional cases; the code should have an Assert
 that either the function pointer is passed, or there is a nameCacheId to
 use.  That way, the object types we already handle in the simpler way do
 not get any more complicated than they are today, and we're not forced
 to create useless callbacks for objects where the lookup is trivial.  The
 function pointer should return boolean, true when the function/collation
 is already in the given schema; that way, the message wording is only
 present in AlterObjectNamespace_internal.

 It seems overly complex to me.  What's wrong with putting special-case
 logic directly into the function?  That seems cleaner and easier to
 understand, and there's no real downside AFAICS.  We have similar
 special cases elsewhere; the code can't be simpler than the actual
 logic.

Does it make sense to invoke AlterFunctionNamespace_oid()
or AlterCollationNamespace_oid() from AlterObjectNamespace_internal()
to check for namespace conflicts?
That would handle the special cases while keeping the common and
object-specific parts modular. Let's consider a function pointer once we
have more than 5 object classes that need special treatment.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Takeshi Yamamuro

Hi,

 Why would that be a good tradeoff to make? Larger stored values require
 more I/O, which is likely to swamp any CPU savings in the compression
 step. Not to mention that a value once written may be read many times,
 so the extra I/O cost could be multiplied many times over later on.

 I agree with this analysis, but I note that the test results show it
 actually improving things along both parameters.

 Hm ... one of us is reading those results backwards, then.
I think that it's a parameter-tuning issue.
I added two parameters, PGLZ_SKIP_SIZE and PGLZ_HASH_GAP, and
set PGLZ_SKIP_SIZE=3 and PGLZ_HASH_GAP=8 for the quick tests.
I also found that the performance with my patch was nearly
equal to that of the current implementation when
PGLZ_SKIP_SIZE=1 and PGLZ_HASH_GAP=1.
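
As a toy, standalone illustration of what a hash-gap knob trades away
(the exact semantics of PGLZ_HASH_GAP in the patch are assumed here, not
taken from its code): indexing only every GAP-th input position makes
history maintenance roughly GAP times cheaper, at the cost of missing
some back-references.

    #include <stdio.h>

    #define GAP 8   /* analogous in spirit to PGLZ_HASH_GAP; semantics assumed */

    static unsigned hash4(const unsigned char *p)
    {
        return (p[0] ^ (p[1] << 4) ^ (p[2] << 8) ^ ((unsigned) p[3] << 12)) & 0xFFFF;
    }

    int main(void)
    {
        static unsigned char src[4096];
        static int history[0x10000];
        int inserts = 0;
        int i;

        for (i = 0; i < 4096; i++)
            src[i] = (unsigned char) (i * 31);

        /* index only every GAP-th position: ~GAP times fewer hash-table
         * inserts, hence faster compression, but some matches are missed */
        for (i = 0; i + 4 <= 4096; i += GAP)
        {
            history[hash4(src + i)] = i;
            inserts++;
        }
        printf("history inserts with GAP=%d: %d (vs %d with GAP=1)\n",
               GAP, inserts, 4096 - 3);
        return 0;
    }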

Apart from my patch, what I care about is that the current
implementation might be too slow relative to I/O. For example, when
compressing and writing large values, compression (20-40MiB/s) can be
a drag on writing data to disk (50-80MiB/s). Moreover, IMHO modern
(and very fast) I/O subsystems such as SSDs make this an even
bigger issue.

Given that, I think it's worth continuing the discussion on improving
compression for 9.4 or later.


 Another thing to keep in mind is that the compression area in general
 is a minefield of patents.  We're fairly confident that pg_lzcompress
 as-is doesn't fall foul of any, but any significant change there would
 probably require more research.
Agreed, and we know ...
we need patent-free ideas for improving compression.
For example, a pluggable compression interface, or something like that.


 I just went back and looked. Unless I'm misreading it he has about a 2.5
 times speed improvement but about a 20% worse compression result.

 What would be interesting would be to see if the knobs he's tweaked
 could be tweaked a bit more to give us substantial speedup without
 significant space degradation.
Yes, you're right, though these results depend heavily
on the data set.


regards,
--

Takeshi Yamamuro
NTT Cyber Communications Laboratory Group
Software Innovation Center
(Open Source Software Center)
Tel: +81-3-5860-5057 Fax: +81-3-5463-5490
Mail:yamamuro.take...@lab.ntt.co.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Takeshi Yamamuro



So why don't we use LZ4?


+1

Agreed, though I think there are still patent issues there.

regards,
--

Takeshi Yamamuro
NTT Cyber Communications Laboratory Group
Software Innovation Center
(Open Source Software Center)
Tel: +81-3-5860-5057 Fax: +81-3-5463-5490
Mail:yamamuro.take...@lab.ntt.co.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples

2013-01-08 Thread Pavan Deolasee
On Tue, Jan 8, 2013 at 8:19 AM, Noah Misch n...@leadboat.com wrote:


 At that point in the investigation, I realized that the cost of being able
 to remove entire tuples in lazy_vacuum_heap() greatly exceeds the benefit.
 Again, the benefit is being able to remove tuples whose inserting
 transaction aborted between the HeapTupleSatisfiesVacuum() call in
 heap_page_prune() and the one in lazy_scan_heap().  To make that possible,
 lazy_vacuum_heap() grabs a cleanup lock, calls PageRepairFragmentation(),
 and emits a WAL record for every page containing LP_DEAD line pointers or
 HEAPTUPLE_DEAD tuples.  If we take it out of the business of removing
 tuples, lazy_vacuum_heap() can skip WAL and update lp_flags under a mere
 shared lock.  The second attached patch, for HEAD, implements that.
 Besides optimizing things somewhat, it simplifies the code and removes
 rarely-tested branches.  (This patch supersedes the backpatch-oriented
 patch rather than stacking with it.)


+1. I'd also advocated removing the line pointer array scan in
lazy_scan_heap(), because heap_page_prune() does most of the work. The
reason we still have that scan in lazy_scan_heap() is to check for new
dead tuples (which should be rare), check for all-visibility of the page,
and freeze tuples if possible. I think we can move all of that to
heap_page_prune().

But while you are at it, I wonder if you have the time and energy to look at
the single-pass vacuum work that I, Robert, and others tried earlier but
failed to get to the final stage [1][2]. If we do something like that, we
might not need the second scan of the heap at all, which, as you rightly
pointed out, does not do much today anyway other than changing LP_DEAD line
pointers to LP_UNUSED. We can push that work to the next vacuum or even a
foreground task.

BTW, with respect to your idea of not WAL-logging the operation of setting
LP_DEAD to LP_UNUSED in lazy_vacuum_page(), I wonder if this would create
problems with crash recovery. During normal vacuum operation, you may have
converted LP_DEAD to LP_UNUSED. Then you actually used one of those line
pointers for a subsequent PageAddItem() on the page. If you crash after
that, but before the page gets written to disk, then during crash recovery,
AFAICS PageAddItem() will complain with a "will not overwrite a used
ItemId" warning and a subsequent PANIC of the recovery.
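
For reference, the guard that would fire is roughly the following
(simplified from PageAddItem() in bufpage.c; during recovery, the redo
routine PANICs when the insert then fails):

    /* simplified sketch of the guard in PageAddItem() (bufpage.c) */
    if (offsetNumber < limit)
    {
        ItemId      itemId = PageGetItemId(phdr, offsetNumber);

        if (ItemIdIsUsed(itemId) || ItemIdHasStorage(itemId))
        {
            elog(WARNING, "will not overwrite a used ItemId");
            return InvalidOffsetNumber;     /* redo routines PANIC on this */
        }
    }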

Thanks,
Pavan

[1] http://archives.postgresql.org/pgsql-hackers/2011-07/msg00624.php
[2]
http://archives.postgresql.org/message-id/caboikdphax5ugugb9rjnsj+zveytv8sn4ctyfcmbc47r6_b...@mail.gmail.com


-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Takeshi Yamamuro

Hi,

(2013/01/07 22:36), Greg Stark wrote:

On Mon, Jan 7, 2013 at 10:21 AM, John R Piercepie...@hogranch.com  wrote:

On 1/7/2013 2:05 AM, Andres Freund wrote:


I think there should be enough bits available in the toast pointer to
indicate the type of compression. I seem to remember somebody even
posting a patch to that effect?
I agree that it's probably too late in the 9.3 cycle to start with this.



so an upgraded database would have old toasted values in the old compression
format, and new toasted values in the new format in an existing table?
that's kind of ugly.


I haven't looked at the patch. It's not obvious to me from the
description that the output isn't backwards compatible. The way the LZ
toast compression works the output is self-describing. There are many
different outputs that would decompress to the same thing and the
compressing code can choose how hard to look for earlier matches and
when to just copy bytes wholesale but the decompression will work
regardless.


My patch is not backwards compatible, so we need some mechanism
to distinguish the old and new on-disk formats.

I think the discussion below is helpful here.
That is, PGLZ_Header could be used for this purpose.
http://archives.postgresql.org/pgsql-hackers/2012-03/msg00971.php
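
For instance (a sketch of the idea only, not the proposal from that
thread): rawsize in PGLZ_Header can never exceed 1GB, so a high bit is
free to tag the format.

    /* sketch: tag the on-disk compression format inside PGLZ_Header;
     * the bit assignment here is made up for illustration */
    typedef struct PGLZ_Header
    {
        int32       vl_len_;    /* varlena header (do not touch directly!) */
        int32       rawsize;    /* uncompressed size; < 1GB, high bits free */
    } PGLZ_Header;

    #define PGLZ_FORMAT_NEW         0x40000000
    #define PGLZ_RAWSIZE(h)         ((h)->rawsize & 0x3FFFFFFF)
    #define PGLZ_IS_NEW_FORMAT(h)   (((h)->rawsize & PGLZ_FORMAT_NEW) != 0)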

regards,
--

Takeshi Yamamuro
NTT Cyber Communications Laboratory Group
Software Innovation Center
(Open Source Software Center)
Tel: +81-3-5860-5057 Fax: +81-3-5463-5490
Mail:yamamuro.take...@lab.ntt.co.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/Python result object str handler

2013-01-08 Thread Magnus Hagander
On Tue, Jan 8, 2013 at 3:58 AM, Peter Eisentraut pete...@gmx.net wrote:
 For debugging PL/Python functions, I'm often tempted to write something
 like

 rv = plpy.execute(...)
 plpy.info(rv)

 which prints something unhelpful like

 PLyResult object at 0xb461d8d8

 By implementing a str handler for the result object, it now prints
 something like

 PLyResult status=5 nrows=2 rows=[{'foo': 1, 'bar': '11'}, {'foo': 2, 'bar': 
 '22'}]

 Patch attached for review.

How does it work if there are many rows in there? Say the result
contains 10,000 rows - will the string contain all of them? If so,
might it be worthwhile to cap the number of rows shown and then follow
with a ... or something?
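
For illustration, capping could look like this standalone sketch (names
and output format are made up, not the patch's code):

    #include <stdio.h>

    /* Hypothetical illustration of capping the printed rows: show at most
     * max_rows rows, then an ellipsis.  Names and format are made up. */
    static void
    print_result_capped(int nrows, int max_rows)
    {
        int     i;
        int     shown = (nrows < max_rows) ? nrows : max_rows;

        printf("PLyResult nrows=%d rows=[", nrows);
        for (i = 0; i < shown; i++)
            printf("%s{'foo': %d}", i > 0 ? ", " : "", i + 1);
        if (nrows > shown)
            printf(", ...");
        printf("]\n");
    }

    int
    main(void)
    {
        print_result_capped(10000, 5);
        return 0;
    }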

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Hannu Krosing

On 01/08/2013 10:19 AM, Takeshi Yamamuro wrote:

Hi,

(2013/01/07 22:36), Greg Stark wrote:
On Mon, Jan 7, 2013 at 10:21 AM, John R Piercepie...@hogranch.com  
wrote:

On 1/7/2013 2:05 AM, Andres Freund wrote:


I think there should be enough bits available in the toast pointer to
indicate the type of compression. I seem to remember somebody even
posting a patch to that effect?
I agree that it's probably too late in the 9.3 cycle to start with 
this.



so an upgraded database would have old toasted values in the old 
compression

format, and new toasted values in the new format in an existing table?
that's kind of ugly.


I haven't looked at the patch. It's not obvious to me from the
description that the output isn't backwards compatible. The way the LZ
toast compression works the output is self-describing. There are many
different outputs that would decompress to the same thing and the
compressing code can choose how hard to look for earlier matches and
when to just copy bytes wholesale but the decompression will work
regardless.


My patch is not backwards compatible, so we need some mechanism
to distinguish the old and new on-disk formats.

Is it a property of our compressed format that makes it hard to do
this in a backwards-compatible way?

Only decompression needs to keep working anyway, as we have not supported
physical compatibility in the other direction in our other tools.

That is, we don't have pg_downgrade :)

Hannu



I think the discussion below is helpful in this use.
That is, PGLZ_Header is used as this purpose.
http://archives.postgresql.org/pgsql-hackers/2012-03/msg00971.php

regards,




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recent ALTER whatever .. SET SCHEMA refactoring

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 4:05 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Does it make sense to invoke AlterFunctionNamespace_oid()
 or AlterCollationNamespace_oid() from AlterObjectNamespace_internal()
 to check for namespace conflicts?
 That would handle the special cases while keeping the common and
 object-specific parts modular. Let's consider a function pointer once we
 have more than 5 object classes that need special treatment.

Unless I'm gravely mistaken, we're only talking about a handful of
lines of code.  We have lots of functions, in objectaddress.c for
example, whose behavior is conditional on the type of object that they
are operating on.  And we just write out all the cases.  I'm not
understanding why we need to take a substantially more complex
approach here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 4:04 AM, Takeshi Yamamuro
yamamuro.take...@lab.ntt.co.jp wrote:
 Apart from my patch, what I care about is that the current
 implementation might be too slow relative to I/O. For example, when
 compressing and writing large values, compression (20-40MiB/s) can be
 a drag on writing data to disk (50-80MiB/s). Moreover, IMHO modern
 (and very fast) I/O subsystems such as SSDs make this an even
 bigger issue.

What about just turning compression off?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Proposal: Store timestamptz of database creation on pg_database

2013-01-08 Thread Robert Haas
On Fri, Jan 4, 2013 at 1:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 1/3/13 3:26 PM, Robert Haas wrote:
 It's true, as we've often
 said here, that leveraging the OS facilities means that we get the
 benefit of improving OS facilities for free - but it also means that
 we never exceed what the OS facilities are able to provide.

 And that should be the deciding factor, shouldn't it?  Clearly, the OS
 timestamps do not satisfy the requirements of tracking database object
 creation times.

Yes, I think so.

But I am not entirely sold on tracking the creation time of every SQL
object.  It might be all right, but what about catalog bloat?

I am on board for databases, and for tables, at any rate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recent ALTER whatever .. SET SCHEMA refactoring

2013-01-08 Thread Kohei KaiGai
2013/1/8 Robert Haas robertmh...@gmail.com:
 On Tue, Jan 8, 2013 at 4:05 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Does it make sense to invoke AlterFunctionNamespace_oid()
 or AlterCollationNamespace_oid() from AlterObjectNamespace_internal()
 to check for namespace conflicts?
 That would handle the special cases while keeping the common and
 object-specific parts modular. Let's consider a function pointer once we
 have more than 5 object classes that need special treatment.

 Unless I'm gravely mistaken, we're only talking about a handful of
 lines of code.  We have lots of functions, in objectaddress.c for
 example, whose behavior is conditional on the type of object that they
 are operating on.  And we just write out all the cases.  I'm not
 understanding why we need to take a substantially more complex
 approach here.

I'm probably describing the same idea. It just adds invocations of external
functions to check naming conflicts for functions or collations; that
takes an additional four lines of special-case handling
in AlterObjectNamespace_internal().
Do you have a different picture of the special-case handling?

@@ -380,6 +368,10 @@ AlterObjectNamespace_internal(Relation rel, Oid
objid, Oid nspOid)
                 errmsg("%s already exists in schema \"%s\"",
                        getObjectDescriptionOids(classId, objid),
                        get_namespace_name(nspOid))));
+   else if (classId == ProcedureRelationId)
+       AlterFunctionNamespace_oid(rel, objid, nspOid);
+   else if (classId == CollationRelationId)
+       AlterCollationNamespace_oid(rel, objid, nspOid);

    /* Build modified tuple */
    values = palloc0(RelationGetNumberOfAttributes(rel) * sizeof(Datum));

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


pgsql-v9.3-alter-namespace-specials.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-08 Thread Amit Kapila
On Monday, January 07, 2013 7:15 PM Andres Freund wrote:
 On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:
  On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:
   On 7 January 2013 12:39, Amit Kapila amit.kap...@huawei.com
 wrote:
  
So We can modify to change this in function LogStandbySnapshot as
   below:
running = GetRunningTransactionData();
if (running->xcnt > 0)
LogCurrentRunningXacts(running);
   
So this check will make sure that if there is no operation
 happening
   i.e. no
new running transaction, then no need to log running transaction
   snapshot
and hence further checkpoint operations will be skipped.
   
Let me know if I am missing something?
  
   It's not the same test. The fact that nothing is running at that
   moment is not the same thing as saying nothing at all has run since
   last checkpoint.
 
  But isn't the functionality of LogStandbySnapshot() to log all running
  xids and all current AccessExclusiveLocks? For RunningTransactionLocks,
  WAL is avoided in a similar way.
 
 The information that no transactions are currently running allows you
 to
 build a recovery snapshot, without that information the standby won't
 start answering queries. Now that doesn't matter if all standbys
 already
 have built a snapshot, but the primary cannot know that.

Can't we make sure that the checkpoint operation doesn't happen under the
conditions below?
a. nothing has happened during or after the last checkpoint
OR
b. nothing except the standby-snapshot WAL record has happened

Currently it is done only for point a.

 Having to issue a checkpoint while ensuring transactions are running
 just to get a standby up doesn't seem like a good idea to me :)

Simon:
 If you make the correct test, I'd be more inclined to accept the premise.

Not sure what exactly you are expecting from the test?
The test is: do any one operation on the system and then keep the system idle.
Now at each checkpoint interval, it logs WAL for the standby snapshot.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-08 Thread Andres Freund
On 2013-01-08 19:51:39 +0530, Amit Kapila wrote:
 On Monday, January 07, 2013 7:15 PM Andres Freund wrote:
  On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:
   On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:
On 7 January 2013 12:39, Amit Kapila amit.kap...@huawei.com
  wrote:
   
 So We can modify to change this in function LogStandbySnapshot as
below:
 running = GetRunningTransactionData();
 if (running->xcnt > 0)
 LogCurrentRunningXacts(running);

 So this check will make sure that if there is no operation
  happening
i.e. no
 new running transaction, then no need to log running transaction
snapshot
 and hence further checkpoint operations will be skipped.

 Let me know if I am missing something?
   
It's not the same test. The fact that nothing is running at that
moment is not the same thing as saying nothing at all has run since
last checkpoint.
  
   But isn't the functionality of LogStandbySnapshot() to log all running
   xids and all current AccessExclusiveLocks? For RunningTransactionLocks,
   WAL is avoided in a similar way.
 
  The information that no transactions are currently running allows you
  to
  build a recovery snapshot, without that information the standby won't
  start answering queries. Now that doesn't matter if all standbys
  already
  have built a snapshot, but the primary cannot know that.

 Can't we make sure that the checkpoint operation doesn't happen under
 the conditions below?
 a. nothing has happened during or after the last checkpoint
 OR
 b. nothing except the standby-snapshot WAL record has happened

 Currently it is done only for point a.

  Having to issue a checkpoint while ensuring transactions are running
  just to get a standby up doesn't seem like a good idea to me :)

 Simon:
  If you make the correct test, I'd be more inclined to accept the premise.

 Not sure what exactly you are expecting from the test?
 The test is: do any one operation on the system and then keep the system
 idle. Now at each checkpoint interval, it logs WAL for the standby snapshot.

I can't really follow what you want to do here. The snapshot is only
logged if a checkpoint is performed anyway. As recovery starts at (the
logical) checkpoint's location, we need to log a snapshot exactly
there. If you want to avoid activity when the system is idle, you need to
prevent the checkpoints themselves from occurring. There was a thread some
time back about that, and it's not as trivial as it seems at first glance.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Claudio Freire
On Tue, Jan 8, 2013 at 10:20 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 8, 2013 at 4:04 AM, Takeshi Yamamuro
 yamamuro.take...@lab.ntt.co.jp wrote:
 Apart from my patch, what I care about is that the current
 implementation might be too slow relative to I/O. For example, when
 compressing and writing large values, compression (20-40MiB/s) can be
 a drag on writing data to disk (50-80MiB/s). Moreover, IMHO modern
 (and very fast) I/O subsystems such as SSDs make this an even
 bigger issue.

 What about just turning compression off?

I've been relying on compression for some big serialized blob fields
for some time now. I bet I'm not alone; lots of people save serialized
data to text fields. So rather than removing it, I'd just change the
default to off (if that were the decision).

However, it might be best to evaluate some of the modern fast
compression schemes like snappy/lz4 (250MB/s per core sounds pretty
good), and implement pluggable compression schemes instead. Snappy
wasn't designed for nothing; it was most likely built because it was
necessary. Cassandra (just to name a system I'm familiar with) started
without compression, and then it was deemed necessary to the point that
they invested considerable time into it. I've always found the fact
that pg does compression of toast tables quite forward-thinking, and
I'd say the feature has to remain there, extended and modernized,
maybe off by default, but there.
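
A pluggable scheme might look something like this (purely hypothetical;
no such interface exists today):

    /* hypothetical pluggable toast-compression interface, for illustration */
    typedef struct ToastCompressor
    {
        const char *name;                       /* e.g. "pglz", "lz4" */
        int32 (*compress)(const char *src, int32 srclen, char *dst);
        int32 (*decompress)(const char *src, int32 srclen,
                            char *dst, int32 rawsize);
    } ToastCompressor;

    /* stored values would then need a per-datum tag identifying which
     * compressor wrote them, so old data stays readable */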


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-08 Thread Claudio Freire
On Tue, Jan 8, 2013 at 11:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Reference: 
 http://postgresql.1045698.n5.nabble.com/Simple-join-doesn-t-use-index-td5738689.html

 This is a pretty common gotcha: user sets shared_buffers but misses
 the esoteric but important effective_cache_size.  ISTM
 effective_cache_size should always be >= shared_buffers -- this is a
 soft configuration error that could be reported as a warning and
 perhaps overridden on the fly.
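
A sketch of the warning being suggested there, purely illustrative
(effective_cache_size and NBuffers are the existing page-unit variables
backing those settings):

    /* illustrative only, not proposed code */
    if (effective_cache_size < NBuffers)
        ereport(WARNING,
                (errmsg("effective_cache_size (%d pages) is below "
                        "shared_buffers (%d pages)",
                        effective_cache_size, NBuffers)));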

Not true. If there are many concurrent users running concurrent
queries against parallel databases, such as on some test systems I have
that contain many databases for many test environments, such a setting
wouldn't make sense. If a DBA sets it lower than shared_buffers,
that setting has to be honored.

Rather, I'd propose that the default setting should be -1, or something
automagic that works most of the time (but not all of it).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread Andrew Dunstan


On 01/08/2013 01:45 AM, james wrote:
The processing functions have been extended to provide
populate_record() and populate_recordset() functions. The latter in
particular could be useful in decomposing a piece of json
representing an array of flat objects (a fairly common pattern) into
a set of Postgres records in a single pass.


So this would allow an 'insert into ... select ... from 
unpack-the-JSON(...)'?


Yes.



I had been wondering how to do such an insertion efficiently in the 
context of SPI, but it seems that there is no SPI_copy equiv that 
would allow a query parse and plan to be avoided.


Your query above would need to be planned too, although the plan will be 
trivial.
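
In SPI terms that could look roughly like this (a sketch: the table
name, the JSON literal, and the exact populate_recordset() call
signature are assumptions, not the patch's API):

    #include "postgres.h"
    #include "executor/spi.h"

    /* sketch: one INSERT ... SELECT, parsed and planned once per call */
    void
    load_json_rows(void)
    {
        SPI_connect();
        SPI_execute("INSERT INTO mytable "
                    "SELECT * FROM populate_recordset(NULL::mytable, "
                    "'[{\"foo\": 1}, {\"foo\": 2}]')",
                    false,  /* read_only */
                    0);     /* no row-count limit */
        SPI_finish();
    }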




Is this mechanism likely to be as fast as we can get at the moment in 
contexts where copy is not feasible?




You should not try to use it as a general bulk load facility. And it 
will not be as fast as COPY for several reasons, including that the Json 
parsing routines are necessarily much heavier than the COPY parse 
routines, which have in any case been optimized over quite a long 
period. Also, a single json datum is limited to no more than 1Gb. If you 
have such a datum, parsing it involves having it in memory and then 
taking a copy (I wonder if we could avoid that step - will take a look).

Then each object is decomposed into a hash table of key/value pairs, 
which is then used to construct the record datum. Each field name in 
the result record is used to look up the value in the hash table - this 
happens once in the case of populate_record() and once per object in the 
array in the case of populate_recordset(). In the latter case the 
resulting records are put into a tuplestore structure (which spills to 
disk if necessary) which is then returned to the caller when all the 
objects in the json array are processed.

COPY doesn't have these sorts of issues. It knows without having to look 
things up where each datum is in each record, and it stashes the result 
straight into the target table. It can read and insert huge numbers of 
rows without significant memory implications.


Both these routines and COPY in non-binary mode use the data type input 
routines to convert text values. In some cases (very notably timestamps) 
these routines can easily be shown to be fantastically expensive 
compared to binary input. This is part of what has led to the creation 
of utilities like pg_bulkload.


Perhaps if you give us a higher level view of what you're trying to 
achieve we can help you better.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-08 Thread Amit Kapila
On Tuesday, January 08, 2013 8:01 PM Andres Freund wrote:
 On 2013-01-08 19:51:39 +0530, Amit Kapila wrote:
  On Monday, January 07, 2013 7:15 PM Andres Freund wrote:
   On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:
On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:
 On 7 January 2013 12:39, Amit Kapila amit.kap...@huawei.com
   wrote:

  So We can modify to change this in function
 LogStandbySnapshot as
 below:
  running = GetRunningTransactionData();
   if (running->xcnt > 0)
  LogCurrentRunningXacts(running);
 
  So this check will make sure that if there is no operation
   happening
 i.e. no
  new running transaction, then no need to log running
 transaction
 snapshot
  and hence further checkpoint operations will be skipped.
 
  Let me know if I am missing something?

 It's not the same test. The fact that nothing is running at
 that
 moment is not the same thing as saying nothing at all has run
 since
 last checkpoint.
   
But isn't the functionality of LogStandbySnapshot() is to log
 all
   running
xids and all current
AccessExclusiveLocks. For RunningTransactionLocks, WAL is
 avoided in
similar way.
  
   The information that no transactions are currently running allows
 you
   to
   build a recovery snapshot, without that information the standby
 won't
   start answering queries. Now that doesn't matter if all standbys
   already
   have built a snapshot, but the primary cannot know that.
 
  Can't we make sure that checkpoint operation doesn't happen for below
 conds.
  a. nothing has happened during or after last checkpoint
  OR
  b. nothing except snapshotstanby WAL has happened
 
  Currently it is done for point a.
 
   Having to issue a checkpoint while ensuring transactions are
 running
   just to get a standby up doesn't seem like a good idea to me :)
 
  Simon:
   If you make the correct test, I'd be more inclined to accept the
 premise.
 
  Not sure, what exact you are expecting from test?
  The test is do any one operation on system and then keep the system
 idle.
  Now at each checkpoint interval, it logs WAL for SnapshotStandby.
 
 I can't really follow what you want to do here. The snapshot is only
 logged if a checkpoint is performed anyway?  As recovery starts at (the
 logical) checkpoint's location we need to log a snapshot exactly
 there. If you want to avoid activity when the system is idle you need
 to
 prevent checkpoints from occurring itself.

Even if a checkpoint is scheduled, it doesn't perform the actual operation
if nothing has been logged between the current and previous checkpoints,
due to the check below in the CreateCheckPoint() function:

if (curInsert == ControlFile->checkPoint +
    MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
    ControlFile->checkPoint ==
    ControlFile->checkPointCopy.redo)

But if we set wal_level to hot_standby, it will log a snapshot; the next
time CreateCheckPoint() is called for a scheduled checkpoint, the above
check will fail and it will log a snapshot again, and so on, even if the
system is totally idle.
I understand that it doesn't cause any problem, but I think it would be
better if the repeated snapshot logging in this scenario could be avoided.

 There was a thread some time
 back about that and its not as trivial as it seems at the first glance.

I know part of it: it was fixed to avoid the checkpoint operation on
low-activity systems, and later that change was rolled back due to another
problem, but I am not sure whether it has been agreed that we don't need
to do anything for the above scenario.


With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Proposal: Store timestamptz of database creation on pg_database

2013-01-08 Thread Robert Haas
On Sat, Jan 5, 2013 at 11:04 AM, Stephen Frost sfr...@snowman.net wrote:
 * Fabrízio de Royes Mello (fabriziome...@gmail.com) wrote:
 * also we discuss about create two new catalogs, one local and another
 shared (like pg_description and pg_shdescription) to track creation times
 of all database objects.

 Creating a separate catalog (or two) every time we want to track XYZ for
 all objects is rather overkill...  Thinking about this a bit more, and
 noting that pg_description/shdescription more-or-less already exist as a
 framework for tracking 'something' for 'all catalog entries' - why don't
 we just add these columns to those tables...?  This would also address
 Peter's concern about making sure we do this 'wholesale' and in one
 release rather than spread across multiple releases- just make sure it
 covers the same set of things which 'comment' does.

I suspect that trying to shoehorn this into
pg_description/pg_shdescription will contort both features
unnecessarily, but I'm willing to be proven wrong.

 Also, I don't think we really need a GUC for this.

Indeed, a GUC would seem to me to defeat the entire point of the feature.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-08 Thread Andres Freund
On 2013-01-08 20:33:28 +0530, Amit Kapila wrote:
 On Tuesday, January 08, 2013 8:01 PM Andres Freund wrote:
  On 2013-01-08 19:51:39 +0530, Amit Kapila wrote:
   On Monday, January 07, 2013 7:15 PM Andres Freund wrote:
On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:
 On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:
  On 7 January 2013 12:39, Amit Kapila amit.kap...@huawei.com
wrote:
 
   So We can modify to change this in function
  LogStandbySnapshot as
  below:
   running = GetRunningTransactionData();
    if (running->xcnt > 0)
   LogCurrentRunningXacts(running);
  
   So this check will make sure that if there is no operation
happening
  i.e. no
   new running transaction, then no need to log running
  transaction
  snapshot
   and hence further checkpoint operations will be skipped.
  
   Let me know if I am missing something?
 
  It's not the same test. The fact that nothing is running at
  that
  moment is not the same thing as saying nothing at all has run
  since
  last checkpoint.

 But isn't the functionality of LogStandbySnapshot() is to log
  all
running
 xids and all current
 AccessExclusiveLocks. For RunningTransactionLocks, WAL is
  avoided in
 similar way.
   
The information that no transactions are currently running allows
  you
to
build a recovery snapshot, without that information the standby
  won't
start answering queries. Now that doesn't matter if all standbys
already
have built a snapshot, but the primary cannot know that.
  
   Can't we make sure that checkpoint operation doesn't happen for below
  conds.
   a. nothing has happened during or after last checkpoint
   OR
   b. nothing except snapshotstanby WAL has happened
  
   Currently it is done for point a.
  
Having to issue a checkpoint while ensuring transactions are
  running
just to get a standby up doesn't seem like a good idea to me :)
  
   Simon:
If you make the correct test, I'd be more inclined to accept the
  premise.
  
   Not sure, what exact you are expecting from test?
   The test is do any one operation on system and then keep the system
  idle.
   Now at each checkpoint interval, it logs WAL for SnapshotStandby.
 
  I can't really follow what you want to do here. The snapshot is only
  logged if a checkpoint is performed anyway?  As recovery starts at (the
  logical) checkpoint's location we need to log a snapshot exactly
  there. If you want to avoid activity when the system is idle you need
  to
  prevent checkpoints from occurring itself.

 Even if a checkpoint is scheduled, it doesn't perform the actual operation
 if nothing has been logged between the current and previous checkpoints,
 due to the check below in the CreateCheckPoint() function:
 if (curInsert == ControlFile->checkPoint +
     MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
     ControlFile->checkPoint ==
     ControlFile->checkPointCopy.redo)

 But if we set wal_level to hot_standby, it will log a snapshot; the next
 time CreateCheckPoint() is called for a scheduled checkpoint, the above
 check will fail and it will log a snapshot again, and so on, even if the
 system is totally idle.
 I understand that it doesn't cause any problem, but I think it would be
 better if the repeated snapshot logging in this scenario could be avoided.

ISTM in that case you just need a way to cope with the additionally
logged record in the above piece of code. Not logging it seems to be
entirely the wrong way to go about this.
I admit it's not totally simple, but making HS less predictable seems
like a cure *far* worse than the disease.
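
One possible shape of that coping, purely as an illustration
(RUNNING_XACTS_RECORD_SIZE is a made-up name; the real record is
variable-length, which is part of why this isn't totally simple):

    /* illustrative only: let the skip test in CreateCheckPoint() tolerate
     * exactly one running-xacts record since the previous checkpoint */
    uint64      dist = curInsert - ControlFile->checkPoint;
    uint64      ckpt = MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint));

    if ((dist == ckpt ||
         dist == ckpt + RUNNING_XACTS_RECORD_SIZE) &&
        ControlFile->checkPoint == ControlFile->checkPointCopy.redo)
        return;             /* nothing of interest; skip the checkpoint */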

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Weird Assert failure in GetLockStatusData()

2013-01-08 Thread Tom Lane
This is a bit disturbing:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushpigdt=2013-01-07%2019%3A15%3A02

The key bit is

[50eb2156.651e:6] LOG:  execute isolationtester_waiting: SELECT 1 FROM pg_locks 
holder, pg_locks waiter WHERE NOT waiter.granted AND waiter.pid = $1 AND 
holder.granted AND holder.pid <> $1 AND holder.pid IN (25887, 25888, 25889) AND 
holder.mode = ANY (CASE waiter.mode WHEN 'AccessShareLock' THEN 
ARRAY['AccessExclusiveLock'] WHEN 'RowShareLock' THEN 
ARRAY['ExclusiveLock','AccessExclusiveLock'] WHEN 'RowExclusiveLock' THEN 
ARRAY['ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ShareUpdateExclusiveLock' THEN 
ARRAY['ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ShareLock' THEN 
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ShareRowExclusiveLock' THEN 
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ExclusiveLock' THEN ARRAY['RowShar!
 
eLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'AccessExclusiveLock' THEN 
ARRAY['AccessShareLock','RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 END) AND holder.locktype IS NOT DISTINCT FROM waiter.locktype AND 
holder.database IS NOT DISTINCT FROM waiter.database AND holder.relation IS NOT 
DISTINCT FROM waiter.relation AND holder.page IS NOT DISTINCT FROM waiter.page 
AND holder.tuple IS NOT DISTINCT FROM waiter.tuple AND holder.virtualxid IS NOT 
DISTINCT FROM waiter.virtualxid AND holder.transactionid IS NOT DISTINCT FROM 
waiter.transactionid AND holder.classid IS NOT DISTINCT FROM waiter.classid AND 
holder.objid IS NOT DISTINCT FROM waiter.objid AND holder.objsubid IS NOT 
DISTINCT FROM waiter.objsubid 
[50eb2156.651e:7] DETAIL:  parameters: $1 = '25889'
TRAP: FailedAssertion("!(el == data->nelements)", File: "lock.c", Line: 3398)
[50eb2103.62ee:2] LOG:  server process (PID 25886) was terminated by signal 6: 
Aborted
[50eb2103.62ee:3] DETAIL:  Failed process was running: SELECT 1 FROM pg_locks 
holder, pg_locks waiter WHERE NOT waiter.granted AND waiter.pid = $1 AND 
holder.granted AND holder.pid <> $1 AND holder.pid IN (25887, 25888, 25889) AND 
holder.mode = ANY (CASE waiter.mode WHEN 'AccessShareLock' THEN 
ARRAY['AccessExclusiveLock'] WHEN 'RowShareLock' THEN 
ARRAY['ExclusiveLock','AccessExclusiveLock'] WHEN 'RowExclusiveLock' THEN 
ARRAY['ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ShareUpdateExclusiveLock' THEN 
ARRAY['ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ShareLock' THEN 
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ShareRowExclusiveLock' THEN 
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
 WHEN 'ExclusiveLock' THEN ARRAY['RowShareL!
 
ock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','E

The assertion failure seems to indicate that the number of
LockMethodProcLockHash entries found by hash_seq_search didn't match the
number that had been counted by hash_get_num_entries immediately before
that.  I don't see any bug in GetLockStatusData itself, so this suggests
that there's something wrong with dynahash's entry counting, or that
somebody somewhere is modifying the shared hash table without holding
the appropriate lock.  The latter seems a bit more likely, given that
this must be a very low-probability bug or we'd have seen it before.
An overlooked locking requirement in a seldom-taken code path would fit
the symptoms.
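
For reference, the failing pattern is roughly this (heavily simplified
from GetLockStatusData() in lock.c; declarations and the copy-out
elided):

    /* count first, then walk; the two must agree while all lock
     * partition locks are held */
    data->nelements = hash_get_num_entries(LockMethodProcLockHash);

    hash_seq_init(&seqstat, LockMethodProcLockHash);
    el = 0;
    while ((proclock = (PROCLOCK *) hash_seq_search(&seqstat)) != NULL)
    {
        /* ... copy the entry into data ... */
        el++;
    }
    Assert(el == data->nelements);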

Or maybe bushpig just had some weird cosmic-ray hardware failure,
but I don't put a lot of faith in such explanations.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/Python result object str handler

2013-01-08 Thread Daniele Varrazzo
On Tue, Jan 8, 2013 at 9:32 AM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 8, 2013 at 3:58 AM, Peter Eisentraut pete...@gmx.net wrote:
 For debugging PL/Python functions, I'm often tempted to write something
 like

 rv = plpy.execute(...)
 plpy.info(rv)

 which prints something unhelpful like

 PLyResult object at 0xb461d8d8

 By implementing a str handler for the result object, it now prints
 something like

 PLyResult status=5 nrows=2 rows=[{'foo': 1, 'bar': '11'}, {'foo': 2, 'bar': 
 '22'}]

This looks more like a repr-style format to me (if you implement repr but
not str, the latter will default to the former).


 Patch attached for review.

 How does it work if there are many rows in there? Say the result
 contains 10,000 rows - will the string contain all of them? If so,
 might it be worthwhile to cap the number of rows shown and then follow
 with a ... or something?

I think it would: old Django versions were a pain in the neck because,
when a page broke, an entire dump of gigantic queries was often emitted
as debug info.

-- Daniele


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_upgrade regression test litters the source tree with log files

2013-01-08 Thread Tom Lane
In a tree in which I previously ran make check in contrib/pg_upgrade:

$ make -s distclean
$ git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   contrib/pg_upgrade/pg_upgrade_dump_1.log
#   contrib/pg_upgrade/pg_upgrade_dump_12912.log
#   contrib/pg_upgrade/pg_upgrade_dump_16384.log
nothing added to commit but untracked files present (use "git add" to track)

Not sure how long this has been happening.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-08 Thread Josh Berkus

On 1/5/13 1:21 PM, Peter Geoghegan wrote:

On 21 December 2012 14:08, Robert Haas robertmh...@gmail.com wrote:

I'm sure it's possible; I don't *think* it's terribly easy.


I'm inclined to agree that this isn't a terribly pressing issue.
Certainly, the need to introduce a bunch of new infrastructure to
detect this case seems hard to justify.


Impossible to justify, I'd say.

Does anyone have any objections to my adding this to the TODO list, in 
case some clever GSOC student comes up with a way to do it *without* 
adding a bunch of infrastructure?


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] xlogreader-v4

2013-01-08 Thread Andres Freund
From: Andres Freund and...@2ndquadrant.com
Subject: [PATCH] xlogreader-v4
In-Reply-To: 

Hi,

this is the latest and obviously best version of xlogreader & xlogdump, with
changes both from Heikki and me.

Changes:
* windows build support for pg_xlogdump
* xlogdump moved to contrib
* xlogdump option parsing enhancements
* generic cleanups
* a few more comments
* removal of some ugliness in XLogFindNextRecord

I think it's mostly ready; for xlogdump, minimally these two issues remain:

const char *
timestamptz_to_str(TimestampTz dt)
{
return "unimplemented-timestamp";
}

const char *
relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
{
return "unimplemented-relpathbackend";
}

aren't exactly the nicest wrapper functions. I think it's ok to simply copy
relpathbackend from the backend, but timestamptz_to_str? That's a heck of a
lot of code.
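
A minimal frontend-side version could settle for UTC and integer
timestamps, something like this sketch (the backend's timezone and
float-timestamp handling is where the bulk of the code lives):

    #include <time.h>

    const char *
    timestamptz_to_str(TimestampTz dt)
    {
        static char buf[32];
        time_t      secs;

        /* TimestampTz counts microseconds since 2000-01-01; shift to
         * the Unix epoch before handing it to libc */
        secs = (time_t) (dt / USECS_PER_SEC) +
            (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY;

        strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S+00", gmtime(&secs));
        return buf;
    }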

Patches 1, 2, and 5 are just preparatory and can probably be applied
beforehand.

3 and 4 are the real meat of this and especially 3 needs some careful review.

Input welcome!

Andres



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Andres Freund
From: Andres Freund and...@anarazel.de

c.h already had parts of the assert support (StaticAssert*) and it's the
shared file between postgres.h and postgres_fe.h. This makes it easier to
build frontend programs which have to do the hack.
---
 src/include/c.h   | 65 +++
 src/include/postgres.h| 54 ++-
 src/include/postgres_fe.h | 12 -
 3 files changed, 67 insertions(+), 64 deletions(-)

diff --git a/src/include/c.h b/src/include/c.h
index f7db157..c30df8b 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -694,6 +694,71 @@ typedef NameData *Name;
 
 
 /*
+ * USE_ASSERT_CHECKING, if defined, turns on all the assertions.
+ * - plai  9/5/90
+ *
+ * It should _NOT_ be defined in releases or in benchmark copies
+ */
+
+/*
+ * Assert() can be used in both frontend and backend code. In frontend code it
+ * just calls the standard assert, if it's available. If use of assertions is
+ * not configured, it does nothing.
+ */
+#ifndef USE_ASSERT_CHECKING
+
+#define Assert(condition)
+#define AssertMacro(condition) ((void)true)
+#define AssertArg(condition)
+#define AssertState(condition)
+
+#elif defined FRONTEND
+
+#include assert.h
+#define Assert(p) assert(p)
+#define AssertMacro(p) ((void) assert(p))
+
+#else /* USE_ASSERT_CHECKING && FRONTEND */
+
+/*
+ * Trap
+ * Generates an exception if the given condition is true.
+ */
+#define Trap(condition, errorType) \
+   do { \
+   if ((assert_enabled) && (condition)) \
+   ExceptionalCondition(CppAsString(condition), 
(errorType), \
+__FILE__, 
__LINE__); \
+   } while (0)
+
+/*
+ * TrapMacro is the same as Trap but it's intended for use in macros:
+ *
+ * #define foo(x) (AssertMacro(x != 0), bar(x))
+ *
+ * Isn't CPP fun?
+ */
+#define TrapMacro(condition, errorType) \
+   ((bool) ((! assert_enabled) || ! (condition) || \
+(ExceptionalCondition(CppAsString(condition), 
(errorType), \
+  __FILE__, 
__LINE__), 0)))
+
+#define Assert(condition) \
+   Trap(!(condition), FailedAssertion)
+
+#define AssertMacro(condition) \
+   ((void) TrapMacro(!(condition), FailedAssertion))
+
+#define AssertArg(condition) \
+   Trap(!(condition), BadArgument)
+
+#define AssertState(condition) \
+   Trap(!(condition), BadState)
+
+#endif /* USE_ASSERT_CHECKING && !FRONTEND */
+
+
+/*
  * Macros to support compile-time assertion checks.
  *
  * If the condition (a compile-time-constant expression) evaluates to false,
diff --git a/src/include/postgres.h b/src/include/postgres.h
index b6e922f..bbe125a 100644
--- a/src/include/postgres.h
+++ b/src/include/postgres.h
@@ -25,7 +25,7 @@
  *   ---   
  * 1)  variable-length datatypes (TOAST support)
  * 2)  datum type + support macros
- * 3)  exception handling definitions
+ * 3)  exception handling
  *
  *  NOTES
  *
@@ -627,62 +627,12 @@ extern Datum Float8GetDatum(float8 X);
 
 
 /* 
- * Section 3:  exception handling definitions
- * Assert, Trap, etc macros
+ * Section 3:  exception handling backend 
support
  * 
  */
 
 extern PGDLLIMPORT bool assert_enabled;
 
-/*
- * USE_ASSERT_CHECKING, if defined, turns on all the assertions.
- * - plai  9/5/90
- *
- * It should _NOT_ be defined in releases or in benchmark copies
- */
-
-/*
- * Trap
- * Generates an exception if the given condition is true.
- */
-#define Trap(condition, errorType) \
-   do { \
-   if ((assert_enabled) && (condition)) \
-   ExceptionalCondition(CppAsString(condition), 
(errorType), \
-__FILE__, 
__LINE__); \
-   } while (0)
-
-/*
- * TrapMacro is the same as Trap but it's intended for use in macros:
- *
- * #define foo(x) (AssertMacro(x != 0), bar(x))
- *
- * Isn't CPP fun?
- */
-#define TrapMacro(condition, errorType) \
-   ((bool) ((! assert_enabled) || ! (condition) || \
-(ExceptionalCondition(CppAsString(condition), 
(errorType), \
-  __FILE__, 
__LINE__), 0)))
-
-#ifndef USE_ASSERT_CHECKING
-#define Assert(condition)
-#define AssertMacro(condition) ((void)true)
-#define AssertArg(condition)
-#define AssertState(condition)
-#else
-#define Assert(condition) \
-   

[HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
From: Andres Freund and...@anarazel.de

relpathbackend() (via some of its wrappers) is used in *_desc routines which
we want to be usable without a backend environment around.

Change the signature to return a 'const char *' to make misuse easier to
detect. That necessitates also changing the 'FileName' typedef to 'const
char *', which seems to be a good idea anyway.
---
 src/backend/access/rmgrdesc/smgrdesc.c |  6 ++---
 src/backend/access/rmgrdesc/xactdesc.c |  6 ++---
 src/backend/access/transam/xlogutils.c |  9 +++
 src/backend/catalog/catalog.c  | 49 +++---
 src/backend/storage/buffer/bufmgr.c| 12 +++--
 src/backend/storage/file/fd.c  |  6 ++---
 src/backend/storage/smgr/md.c  | 23 +---
 src/backend/utils/adt/dbsize.c |  4 +--
 src/include/catalog/catalog.h  |  2 +-
 src/include/storage/fd.h   |  9 +++
 10 files changed, 42 insertions(+), 84 deletions(-)

diff --git a/src/backend/access/rmgrdesc/smgrdesc.c 
b/src/backend/access/rmgrdesc/smgrdesc.c
index bcabf89..490c8c7 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,19 +26,17 @@ smgr_desc(StringInfo buf, uint8 xl_info, char *rec)
if (info == XLOG_SMGR_CREATE)
{
xl_smgr_create *xlrec = (xl_smgr_create *) rec;
-   char   *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+   const char *path = relpathperm(xlrec->rnode, xlrec->forkNum);
 
    appendStringInfo(buf, "file create: %s", path);
-   pfree(path);
 }
 else if (info == XLOG_SMGR_TRUNCATE)
 {
    xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
-   char   *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+   const char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
 
    appendStringInfo(buf, "file truncate: %s to %u blocks", path,
                     xlrec->blkno);
-   pfree(path);
 }
 else
    appendStringInfo(buf, "UNKNOWN");
diff --git a/src/backend/access/rmgrdesc/xactdesc.c 
b/src/backend/access/rmgrdesc/xactdesc.c
index 2471279..b86a53e 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -35,10 +35,9 @@ xact_desc_commit(StringInfo buf, xl_xact_commit *xlrec)
    appendStringInfo(buf, "; rels:");
    for (i = 0; i < xlrec->nrels; i++)
    {
-       char   *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);
+       const char *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);
 
        appendStringInfo(buf, " %s", path);
-       pfree(path);
    }
 }
 if (xlrec->nsubxacts > 0)
@@ -105,10 +104,9 @@ xact_desc_abort(StringInfo buf, xl_xact_abort *xlrec)
    appendStringInfo(buf, "; rels:");
    for (i = 0; i < xlrec->nrels; i++)
    {
-       char   *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);
+       const char *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);
 
        appendStringInfo(buf, " %s", path);
-       pfree(path);
    }
 }
 if (xlrec->nsubxacts > 0)
diff --git a/src/backend/access/transam/xlogutils.c 
b/src/backend/access/transam/xlogutils.c
index f9a6e62..8266f3c 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -57,7 +57,7 @@ static void
 report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
                     BlockNumber blkno, bool present)
 {
-   char   *path = relpathperm(node, forkno);
+   const char *path = relpathperm(node, forkno);
 
    if (present)
        elog(elevel, "page %u of relation %s is uninitialized",
@@ -65,7 +65,6 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
    else
        elog(elevel, "page %u of relation %s does not exist",
             blkno, path);
-   pfree(path);
 }
 
 /* Log a reference to an invalid page */
@@ -153,11 +152,10 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
 {
    if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2)
    {
-       char   *path = relpathperm(hentry->key.node, forkno);
+       const char *path = relpathperm(hentry->key.node, forkno);
 
        elog(DEBUG2, "page %u of relation %s has been dropped",
             hentry->key.blkno, path);
-       pfree(path);
    }
 
    if (hash_search(invalid_page_tab,
@@ -186,11 +184,10 @@ 

[HACKERS] [PATCH 5/5] remove spurious space in running_xact's _desc function

2013-01-08 Thread Andres Freund
From: Andres Freund and...@anarazel.de

---
 src/backend/access/rmgrdesc/standbydesc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/rmgrdesc/standbydesc.c 
b/src/backend/access/rmgrdesc/standbydesc.c
index c38892b..5fb6f54 100644
--- a/src/backend/access/rmgrdesc/standbydesc.c
+++ b/src/backend/access/rmgrdesc/standbydesc.c
@@ -57,7 +57,7 @@ standby_desc(StringInfo buf, uint8 xl_info, char *rec)
{
xl_running_xacts *xlrec = (xl_running_xacts *) rec;
 
-   appendStringInfo(buf, " running xacts:");
+   appendStringInfo(buf, "running xacts:");
standby_desc_running_xacts(buf, xlrec);
}
else
-- 
1.7.12.289.g0ce9864.dirty



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH 4/5] Add pg_xlogdump contrib module

2013-01-08 Thread Andres Freund
From: Andres Freund and...@anarazel.de

Authors: Andres Freund, Heikki Linnakangas
---
 contrib/Makefile  |   1 +
 contrib/pg_xlogdump/Makefile  |  37 +++
 contrib/pg_xlogdump/compat.c  |  58 
 contrib/pg_xlogdump/pg_xlogdump.c | 654 ++
 contrib/pg_xlogdump/tables.c  |  78 +
 doc/src/sgml/ref/allfiles.sgml|   1 +
 doc/src/sgml/ref/pg_xlogdump.sgml |  76 +
 doc/src/sgml/reference.sgml   |   1 +
 src/backend/access/transam/rmgr.c |   1 +
 src/backend/catalog/catalog.c |   2 +
 src/tools/msvc/Mkvcbuild.pm   |  16 +-
 11 files changed, 924 insertions(+), 1 deletion(-)
 create mode 100644 contrib/pg_xlogdump/Makefile
 create mode 100644 contrib/pg_xlogdump/compat.c
 create mode 100644 contrib/pg_xlogdump/pg_xlogdump.c
 create mode 100644 contrib/pg_xlogdump/tables.c
 create mode 100644 doc/src/sgml/ref/pg_xlogdump.sgml

diff --git a/contrib/Makefile b/contrib/Makefile
index fcd7c1e..5d290b8 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -39,6 +39,7 @@ SUBDIRS = \
pg_trgm \
pg_upgrade  \
pg_upgrade_support \
+   pg_xlogdump \
pgbench \
pgcrypto\
pgrowlocks  \
diff --git a/contrib/pg_xlogdump/Makefile b/contrib/pg_xlogdump/Makefile
new file mode 100644
index 000..1adef35
--- /dev/null
+++ b/contrib/pg_xlogdump/Makefile
@@ -0,0 +1,37 @@
+# contrib/pg_xlogdump/Makefile
+
+PGFILEDESC = pg_xlogdump
+PGAPPICON=win32
+
+PROGRAM = pg_xlogdump
+OBJS =  pg_xlogdump.o compat.o tables.o xlogreader.o $(RMGRDESCOBJS) \
+   $(WIN32RES)
+
+# XXX: Perhaps this should be done by a wildcard rule so that you don't need
+# to remember to add new rmgrdesc files to this list.
+RMGRDESCSOURCES = clogdesc.c dbasedesc.c gindesc.c gistdesc.c hashdesc.c \
+   heapdesc.c mxactdesc.c nbtdesc.c relmapdesc.c seqdesc.c smgrdesc.c \
+   spgdesc.c standbydesc.c tblspcdesc.c xactdesc.c xlogdesc.c
+
+RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
+
+EXTRA_CLEAN = $(RMGRDESCSOURCES) xlogreader.c
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_xlogdump
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+
+xlogreader.c: % : $(top_srcdir)/src/backend/access/transam/%
+	rm -f $@ && $(LN_S) $< .
+
+$(RMGRDESCSOURCES): % : $(top_srcdir)/src/backend/access/rmgrdesc/%
+	rm -f $@ && $(LN_S) $< .
diff --git a/contrib/pg_xlogdump/compat.c b/contrib/pg_xlogdump/compat.c
new file mode 100644
index 000..e150afb
--- /dev/null
+++ b/contrib/pg_xlogdump/compat.c
@@ -0,0 +1,58 @@
+/*-
+ *
+ * compat.c
+ * Reimplementations of various backend functions.
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/pg_xlogdump/compat.c
+ *
+ * This file contains client-side implementations for various backend
+ * functions that the rm_desc functions in *desc.c files rely on.
+ *
+ *-
+ */
+
+/* ugly hack, same as in e.g pg_controldata */
+#define FRONTEND 1
+#include "postgres.h"
+
+#include "catalog/catalog.h"
+#include "datatype/timestamp.h"
+#include "lib/stringinfo.h"
+#include "storage/relfilenode.h"
+#include "utils/timestamp.h"
+#include "utils/datetime.h"
+
+const char *
+timestamptz_to_str(TimestampTz dt)
+{
+	return "unimplemented-timestamp";
+}
+
+const char *
+relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
+{
+	return "unimplemented-relpathbackend";
+}
+
+/*
+ * Provide a hacked up compat layer for StringInfos so xlog desc functions can
+ * be linked/called.
+ */
+void
+appendStringInfo(StringInfo str, const char *fmt, ...)
+{
+   va_list args;
+
+   va_start(args, fmt);
+   vprintf(fmt, args);
+   va_end(args);
+}
+
+void
+appendStringInfoString(StringInfo str, const char *string)
+{
+	appendStringInfo(str, "%s", string);
+}
diff --git a/contrib/pg_xlogdump/pg_xlogdump.c 
b/contrib/pg_xlogdump/pg_xlogdump.c
new file mode 100644
index 000..29ee73e
--- /dev/null
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -0,0 +1,654 @@
+/*-
+ *
+ * pg_xlogdump.c - decode and display WAL
+ *
+ * Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *   contrib/pg_xlogdump/pg_xlogdump.c
+ *-
+ */
+
+/* ugly hack, same as in e.g pg_controldata */
+#define FRONTEND 1
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/xlog.h"
+#include "access/xlogreader.h"
+#include 

Re: [HACKERS] [PATCH] xlogreader-v4

2013-01-08 Thread Thom Brown
On 8 January 2013 19:09, Andres Freund and...@2ndquadrant.com wrote:

 From: Andres Freund and...@2ndquadrant.com
 Subject: [PATCH] xlogreader-v4
 In-Reply-To:

 Hi,

 this is the latest and obviously best version of xlogreader & xlogdump with
 changes both from Heikki and me.


Aren't you forgetting something?

-- 
Thom


Re: [HACKERS] [PATCH] xlogreader-v4

2013-01-08 Thread Thom Brown
On 8 January 2013 19:15, Thom Brown t...@linux.com wrote:

 On 8 January 2013 19:09, Andres Freund and...@2ndquadrant.com wrote:

 From: Andres Freund and...@2ndquadrant.com
 Subject: [PATCH] xlogreader-v4
 In-Reply-To:

 Hi,

 this is the latest and obviously best version of xlogreader & xlogdump
 with
 changes both from Heikki and me.


 Aren't you forgetting something?


I see, you're posting them separately.  Nevermind.

-- 
Thom


Re: [HACKERS] [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 From: Andres Freund and...@anarazel.de
 c.h already had parts of the assert support (StaticAssert*) and it's the shared
 file between postgres.h and postgres_fe.h. This makes it easier to build
 frontend programs which have to do the hack.

This patch seems unnecessary given that we already put a version of Assert()
into postgres_fe.h.  I don't think that moving the two different
definitions into an #if block in one file is an improvement.  If that
were an improvement, we might as well move everything in both postgres.h
and postgres_fe.h into c.h with a pile of #ifs.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] xlogreader-v4

2013-01-08 Thread Andres Freund
On 2013-01-08 20:09:42 +0100, Andres Freund wrote:
 From: Andres Freund and...@2ndquadrant.com
 Subject: [PATCH] xlogreader-v4
 In-Reply-To: 
 
 Hi,
 
 this is the latest and obviously best version of xlogreader & xlogdump with
 changes both from Heikki and me.
 
 Changes:
 * windows build support for pg_xlogdump

That was done blindly, btw, so I only know it compiles, not that it
runs...

It's in git://git.postgresql.org/git/users/andresfreund/postgres.git
branch xlogreader_v4 btw.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 From: Andres Freund and...@anarazel.de
 relpathbackend() (via some of its wrappers) is used in *_desc routines
 which we want to be usable without a backend environment around.

I'm 100% unimpressed with making relpathbackend return a pointer to a
static buffer.  Who's to say whether that won't create bugs due to
overlapping usages?

 Change signature to return a 'const char *' to make misuse easier to
 detect.

That seems to create way more churn than is necessary, and it's wrong
anyway if the result is palloc'd.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Andres Freund
On 2013-01-08 14:25:06 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  From: Andres Freund and...@anarazel.de
  c.h already had parts of the assert support (StaticAssert*) and it's the
  shared file between postgres.h and postgres_fe.h. This makes it easier to
  build frontend programs which have to do the hack.

 This patch seems unnecessary given that we already put a version of Assert()
 into postgres_fe.h.  I don't think that moving the two different
 definitions into an #if block in one file is an improvement.  If that
 were an improvement, we might as well move everything in both postgres.h
 and postgres_fe.h into c.h with a pile of #ifs.

The problem is that some (including existing) pieces of code need to
include postgres.h itself, those can't easily include postgres_fe.h as
well without getting into problems with redefinitions. It seems the most
consistent to move all of that into c.h, enough of the assertion stuff
is already there, I don't see an advantage of splitting it across 3
files as it currently is.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-01-08 14:25:06 -0500, Tom Lane wrote:
 This patch seems unnecessary given that we already put a version of Assert()
 into postgres_fe.h.

 The problem is that some (including existing) pieces of code need to
 include postgres.h itself, those can't easily include postgres_fe.h as
 well without getting into problems with redefinitions.

There is no place, anywhere, that should be including both.  So I don't
see the problem.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 14:28:14 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 From: Andres Freund and...@anarazel.de
  relpathbackend() (via some of its wrappers) is used in *_desc routines
  which we want to be usable without a backend environment around.

 I'm 100% unimpressed with making relpathbackend return a pointer to a
 static buffer.  Who's to say whether that won't create bugs due to
 overlapping usages?

I say it ;). I've gone through all callers and checked. Not that that
guarantees anything, but ...

The reason a static buffer is better is that some of the *desc routines
use relpathbackend() and pfree() the result. That would require
providing a stub pfree() in xlogdump which seems to be exceedingly ugly.

  Change signature to return a 'const char *' to make misuse easier to
  detect.

 That seems to create way more churn than is necessary, and it's wrong
 anyway if the result is palloc'd.

It causes warnings in potential external users that pfree() the result
of relpathbackend which seems helpful. Obviously only makes sense if
relpathbackend() returns a pointer into a static buffer...
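
For illustration, a minimal sketch of the static-buffer style under
discussion, with a simplified path layout, a made-up function name, and
a standalone MAXPGPATH; not the actual patch:

#include <stdio.h>

#define MAXPGPATH 1024

static char relpath_buf[MAXPGPATH];

const char *
relpathbackend_sketch(unsigned dbNode, unsigned relNode)
{
	/* the result is only valid until the next call overwrites the buffer */
	snprintf(relpath_buf, sizeof(relpath_buf), "base/%u/%u",
			 dbNode, relNode);
	return relpath_buf;
}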

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-08 Thread Simon Riggs
On 8 January 2013 18:46, Josh Berkus j...@agliodbs.com wrote:
 On 1/5/13 1:21 PM, Peter Geoghegan wrote:

 On 21 December 2012 14:08, Robert Haas robertmh...@gmail.com wrote:

 I'm sure it's possible; I don't *think* it's terribly easy.


 I'm inclined to agree that this isn't a terribly pressing issue.
 Certainly, the need to introduce a bunch of new infrastructure to
 detect this case seems hard to justify.


 Impossible to justify, I'd say.

 Does anyone have any objections to my adding this to the TODO list, in case
 some clever GSOC student comes up with a way to do it *without* adding a
 bunch of infrastructure?

Daniel already did object

I think we have other problems that need solving much more than this.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-01-08 14:28:14 -0500, Tom Lane wrote:
 I'm 100% unimpressed with making relpathbackend return a pointer to a
 static buffer.  Who's to say whether that won't create bugs due to
 overlapping usages?

 I say it ;). I've gone through all callers and checked. Not that that
 guarantees anything, but ...

Even if you've proven it safe today, it seems unnecessarily fragile.
Just about any other place we've used a static result buffer, we've
regretted it, unless the use cases were *very* narrow.

 The reason a static buffer is better is that some of the *desc routines
 use relpathbackend() and pfree() the result. That would require
 providing a stub pfree() in xlogdump which seems to be exceedingly ugly.

Why?  If we have palloc support how hard can it be to map pfree to free?
And why wouldn't we want to?  I can hardly imagine providing only palloc
and not pfree support.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-08 Thread David Fetter
On Tue, Jan 08, 2013 at 10:46:12AM -0800, Josh Berkus wrote:
 On 1/5/13 1:21 PM, Peter Geoghegan wrote:
 On 21 December 2012 14:08, Robert Haas robertmh...@gmail.com wrote:
 I'm sure it's possible; I don't *think* it's terribly easy.
 
 I'm inclined to agree that this isn't a terribly pressing issue.
 Certainly, the need to introduce a bunch of new infrastructure to
 detect this case seems hard to justify.
 
 Impossible to justify, I'd say.
 
 Does anyone have any objections to my adding this to the TODO list,
 in case some clever GSOC student comes up with a way to do it
 *without* adding a bunch of infrastructure?

I'm pretty sure the logical change stuff Andres et al. are working on
will be able to include the originating node, which makes cycle
detection dead simple.

Other restrictions on the graph, like "must be a tree", might be more
complicated.
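
A toy sketch of the origin-based check described above; the types and
names here are hypothetical, not the logical replication patch's API:

#include <stdbool.h>

typedef struct Change
{
	int			origin_node_id;	/* node where the change first originated */
} Change;

static bool
should_apply(const Change *change, int my_node_id)
{
	/* a change arriving back at its origin has completed a cycle */
	return change->origin_node_id != my_node_id;
}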

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 14:53:29 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-01-08 14:28:14 -0500, Tom Lane wrote:
  I'm 100% unimpressed with making relpathbackend return a pointer to a
  static buffer.  Who's to say whether that won't create bugs due to
  overlapping usages?

  I say it ;). I've gone through all callers and checked. Not that that
  guarantees anything, but ...

 Even if you've proven it safe today, it seems unnecessarily fragile.
 Just about any other place we've used a static result buffer, we've
 regretted it, unless the use cases were *very* narrow.

Hm, relpathbackend seems pretty narrow to me.

Funny, we both argued the other way round than we are now when talking
about the sprintf(..., "%X/%X", (uint32)(recptr >> 32), (uint32)recptr)
thingy ;)
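
For reference, the %X/%X LSN formatting idiom mentioned above, as a
self-contained sketch:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	uint64_t	recptr = ((uint64_t) 1 << 32) | 0x12345678;
	char		buf[32];

	/* split the 64-bit position into its high and low 32-bit halves */
	snprintf(buf, sizeof(buf), "%X/%X",
			 (uint32_t) (recptr >> 32), (uint32_t) recptr);
	printf("%s\n", buf);		/* prints 1/12345678 */
	return 0;
}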

  The reason a static buffer is better is that some of the *desc routines
  use relpathbackend() and pfree() the result. That would require
  providing a stub pfree() in xlogdump which seems to be exceedingly ugly.

 Why?  If we have palloc support how hard can it be to map pfree to free?
 And why wouldn't we want to?  I can hardly imagine providing only palloc
 and not pfree support.

Uhm, we don't have & need palloc support and I don't think
relpathbackend() is a good justification for adding it.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread james


I had been wondering how to do such an insertion efficiently in the context of 
SPI, but it seems that there is no SPI_copy equiv that would allow a query 
parse and plan to be avoided.


Your query above would need to be planned too, although the plan will be 
trivial.


Ah yes, I meant that I had not found a way to avoid it (for multi-row 
inserts etc) from a stored proc context where I have SPI functions 
available.



You should not try to use it as a general bulk load facility. And it will not 
be as fast as COPY for several reasons, including that the Json parsing 
routines are necessarily much heavier than the COPY parse routines, which have 
in any case been optimized over quite a long period. Also, a single json datum 
is limited to no more than 1Gb. If you have such a datum, parsing it involves 
having it in memory and then taking a copy (I wonder if we could avoid that 
step - will take a look). Then each object is decomposed into a hash table of 
key value pairs, which is then used to construct the record datum. Each field 
name in the result record is used to look up the value in the hash table - 
this happens once in the case of populate_record() and once per object in the 
array in the case of populate_recordset(). In the latter case the resulting 
records are put into a tuplestore structure (which spills to disk if necessary) 
which is then returned to the caller when all the objects in the json array 
are processed. COPY doesn't have these sorts of issues. It knows 
without having to look things up where each datum is in each record, and it 
stashes the result straight into the target table. It can read and insert huge 
numbers of rows without significant memory implications.

Yes - but I don't think I can use COPY from a stored proc context, can I? 
 If I could use binary COPY from a stored proc that has received a 
binary param and unpacked to the data, it would be handy.


If SPI provided a way to perform a copy to a temp table and then some 
callback on an iterator that yields rows to it, that would do the trick 
I guess.



Perhaps if you give us a higher level view of what you're trying to achieve we 
can help you better.


I had been trying to identify a way to work with record sets where the 
records might be used for insert, or for updates or deletion statements, 
preferably without forming a large custom SQL statement that must then 
be parsed and planned (and which would be a PITA if I wanted to use the 
SQL-C preprocessor or some language bindings that like to prepare a 
statement and execute with params).


The data I work with has a master-detail structure and insertion 
performance matters, so I'm trying to limit manipulations to one 
statement per table per logical operation even where there are multiple 
detail rows.


Sometimes the network latency can be a pain too and that also suggests 
an RPC with unpack and insert locally.


Cheers
James



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread Andrew Dunstan


On 01/08/2013 09:58 AM, Andrew Dunstan wrote:


If you have such a datum, parsing it involves having it in memory and 
then taking a copy (I wonder if we could avoid that step - will take a 
look).



Here is a Proof Of Concept patch against my development tip on what's 
involved in getting the JSON lexer not to need a nul-terminated string 
to parse. This passes regression, incidentally. The downside is that 
processing is very slightly more complex, and that json_in() would need 
to call strlen() on its input. The upside would be that the processing 
routines I've been working on would no longer need to create copies of 
their json arguments using text_to_cstring() just so they can get a 
null-terminated string to process.


Consequent changes would modify the signature of makeJsonLexContext() so 
its first argument would be a text* instead of a char* (and of course 
its logic would change accordingly).
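
A minimal sketch of the behavioral difference the POC introduces; the
helper names are illustrative, not the patch's API:

#include <stdbool.h>
#include <stddef.h>

static bool
at_end_nul(const char *s)
{
	return *s == '\0';			/* needs a NUL-terminated copy of the datum */
}

static bool
at_end_len(const char *input, size_t input_length, const char *s)
{
	return (size_t) (s - input) >= input_length;	/* scans the datum in place */
}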


I could go either way. Thoughts?

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Andres Freund
On 2013-01-08 14:35:12 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-01-08 14:25:06 -0500, Tom Lane wrote:
  This patch seems unnecessary given that we already put a version of 
  Assert()
  into postgres_fe.h.
 
  The problem is that some (including existing) pieces of code need to
  include postgres.h itself, those can't easily include postgres_fe.h as
  well without getting into problems with redefinitions.
 
 There is no place, anywhere, that should be including both.  So I don't
 see the problem.

Sorry, misremembered the problem somewhat. The problem is that code that
includes postgres.h atm ends up with ExceptionalCondition() et
al. declared even if FRONTEND is defined. So if anything uses an assert
you need to provide wrappers for those which seems nasty. If they are
provided centrally and check for FRONTEND that problem doesn't exist.
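
A simplified sketch of the centralization being argued for (not the
committed macros): c.h would pick the Assert implementation based on
FRONTEND instead of postgres.h and postgres_fe.h each providing one.

#ifdef FRONTEND
#include <assert.h>
#define Assert(condition)	assert(condition)
#else
#define Assert(condition) \
	do { \
		if (!(condition)) \
			ExceptionalCondition(#condition, "FailedAssertion", \
								 __FILE__, __LINE__); \
	} while (0)
#endif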

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread Andrew Dunstan


On 01/08/2013 03:12 PM, Andrew Dunstan wrote:


On 01/08/2013 09:58 AM, Andrew Dunstan wrote:


If you have such a datum, parsing it involves having it in memory and 
then taking a copy (I wonder if we could avoid that step - will take 
a look).



Here is a Proof Of Concept patch against my development tip on what's 
involved in getting the JSON lexer not to need a nul-terminated string 
to parse. This passes regression, incidentally. The downside is that 
processing is very slightly more complex, and that json_in() would 
need to call strlen() on its input. The upside would be that the 
processing routines I've been working on would no longer need to 
create copies of their json arguments using text_to_cstring() just so 
they can get a null-terminated string to process.


Consequent changes would modify the signature of makeJsonLexContext() 
so its first argument would be a text* instead of a char* (and of 
course its logic would change accordingly).


I could go either way. Thoughts?




this time with patch ...



*** a/src/backend/utils/adt/json.c
--- b/src/backend/utils/adt/json.c
***
*** 212,217  makeJsonLexContext(char *json, bool need_escapes)
--- 212,218 
  
  	lex->input = lex->token_terminator = lex->line_start = json;
  	lex->line_number = 1;
+ 	lex->input_length = strlen(json);
  	if (need_escapes)
  		lex->strval = makeStringInfo();
  	return lex;
***
*** 398,416  static void
  json_lex(JsonLexContext *lex)
  {
  	char	   *s;
! 
  	/* Skip leading whitespace. */
  	s = lex->token_terminator;
! 	while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r')
  	{
  		if (*s == '\n')
  			++lex->line_number;
  		++s;
  	}
  	lex->token_start = s;
  
  	/* Determine token type. */
! 	if (*s == '\0')
  	{
  		lex->token_start = NULL;
  		lex->prev_token_terminator = lex->token_terminator;
--- 399,420 
  json_lex(JsonLexContext *lex)
  {
  	char	   *s;
! 	int len;
  	/* Skip leading whitespace. */
  	s = lex->token_terminator;
! 	len = s - lex->input;
! 	while (len < lex->input_length &&
! 		   (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r'))
  	{
  		if (*s == '\n')
  			++lex->line_number;
  		++s;
+ 		++len;
  	}
  	lex->token_start = s;
  
  	/* Determine token type. */
! 	if (len >= lex->input_length)
  	{
  		lex->token_start = NULL;
  		lex->prev_token_terminator = lex->token_terminator;
***
*** 476,482  json_lex(JsonLexContext *lex)
  		 * whole word as an unexpected token, rather than just some
  		 * unintuitive prefix thereof.
  		 */
! 		for (p = s; JSON_ALPHANUMERIC_CHAR(*p); p++)
  			 /* skip */ ;
  
  		/*
--- 480,486 
  		 * whole word as an unexpected token, rather than just some
  		 * unintuitive prefix thereof.
  		 */
! 		for (p = s; JSON_ALPHANUMERIC_CHAR(*p) && p - s < lex->input_length - len; p++)
  			 /* skip */ ;
  
  		/*
***
*** 519,539  static void
  json_lex_string(JsonLexContext *lex)
  {
  	char	   *s;
! 
  	if (lex->strval != NULL)
  		resetStringInfo(lex->strval);
  
! 	for (s = lex->token_start + 1; *s != '"'; s++)
  	{
! 		/* Per RFC4627, these characters MUST be escaped. */
! 		if ((unsigned char) *s < 32)
  		{
! 			/* A NUL byte marks the (premature) end of the string. */
! 			if (*s == '\0')
! 			{
! 				lex->token_terminator = s;
! 				report_invalid_token(lex);
! 			}
  			/* Since *s isn't printable, exclude it from the context string */
  			lex->token_terminator = s;
  			ereport(ERROR,
--- 523,545 
  json_lex_string(JsonLexContext *lex)
  {
  	char	   *s;
! 	int len;
  	if (lex->strval != NULL)
  		resetStringInfo(lex->strval);
  
! 	len = lex->token_start - lex->input;
! 	len++;
! 	for (s = lex->token_start + 1; *s != '"'; s++, len++)
  	{
! 		/* Premature end of the string. */
! 		if (len >= lex->input_length)
  		{
! 			lex->token_terminator = s;
! 			report_invalid_token(lex);
! 		}
! 		else if ((unsigned char) *s < 32)
! 		{
! 			/* Per RFC4627, these characters MUST be escaped. */
  			/* Since *s isn't printable, exclude it from the context string */
  			lex->token_terminator = s;
  			ereport(ERROR,
***
*** 547,553  json_lex_string(JsonLexContext *lex)
  		{
  			/* OK, we have an escape character. */
  			s++;
! 			if (*s == '\0')
  			{
  				lex->token_terminator = s;
  				report_invalid_token(lex);
--- 553,560 
  		{
  			/* OK, we have an escape character. */
  			s++;
! 			len++;
! 			if (len >= lex->input_length)
  			{
  				lex->token_terminator = s;
  				report_invalid_token(lex);
***
*** 560,566  json_lex_string(JsonLexContext *lex)
  				for (i = 1; i <= 4; i++)
  				{
  					s++;
! 					if (*s == '\0')
  					{
  						lex->token_terminator = s;
  						report_invalid_token(lex);
--- 567,574 
  				for (i = 1; i <= 4; i++)
  				{
  					s++;
! 					len++;
! 					if (len >= lex->input_length)
  					{
  						lex->token_terminator = s;
  						report_invalid_token(lex);
***
*** 690,696  json_lex_number(JsonLexContext *lex, 

Re: [HACKERS] json api WIP patch

2013-01-08 Thread Andrew Dunstan


On 01/08/2013 03:07 PM, james wrote:



Yes - but I don't think I can use COPY from a stored proc context can 
I?  If I could use binary COPY from a stored proc that has received a 
binary param and unpacked to the data, it would be handy.




You can use COPY from a stored procedure, but only to and from files.
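
A minimal sketch of what that looks like from a C-language function; the
table name and file path are illustrative:

#include "postgres.h"
#include "executor/spi.h"

static void
copy_from_file_sketch(void)
{
	if (SPI_connect() != SPI_OK_CONNECT)
		elog(ERROR, "SPI_connect failed");

	/* COPY is a utility statement, so SPI_OK_UTILITY is the expected result */
	if (SPI_execute("COPY mytable FROM '/tmp/mytable.dat'",
					false, 0) != SPI_OK_UTILITY)
		elog(ERROR, "COPY via SPI failed");

	SPI_finish();
}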

If SPI provided a way to perform a copy to a temp table and then some 
callback on an iterator that yields rows to it, that would do the 
trick I guess.


SPI is useful, but it's certainly possible to avoid its use. After all, 
that's what almost the whole backend does, including the COPY code. Of 
course, it's a lot harder to write that way, which is part of why SPI 
exists. Efficiency has its price.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Uhm, we don't have & need palloc support and I don't think
 relpathbackend() is a good justification for adding it.

I've said from the very beginning of this effort that it would be
impossible to share any meaningful amount of code between frontend and
backend environments without adding some sort of emulation of
palloc/pfree/elog.  I think this patch is just making the code uglier
and more fragile to put off the inevitable, and that we'd be better
served to bite the bullet and do that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-08 Thread Simon Riggs
On 8 January 2013 19:53, David Fetter da...@fetter.org wrote:
 On Tue, Jan 08, 2013 at 10:46:12AM -0800, Josh Berkus wrote:
 On 1/5/13 1:21 PM, Peter Geoghegan wrote:
 On 21 December 2012 14:08, Robert Haas robertmh...@gmail.com wrote:
 I'm sure it's possible; I don't *think* it's terribly easy.
 
 I'm inclined to agree that this isn't a terribly pressing issue.
 Certainly, the need to introduce a bunch of new infrastructure to
 detect this case seems hard to justify.

 Impossible to justify, I'd say.

 Does anyone have any objections to my adding this to the TODO list,
 in case some clever GSOC student comes up with a way to do it
 *without* adding a bunch of infrastructure?

 I'm pretty sure the logical change stuff Andres et al. are working on
 will be able to include the originating node, which makes cycle
 detection dead simple.

That's a different thing really, but I see what you mean.

The problem here is how you tell whether an indirect connection is
connected to the master. It's not just a hard problem, it's a transient
problem, where any one person's view of the answer might be in the
midst of changing as you measure it. So throwing an error message
might make certain cluster configs inoperable.

I'd prefer to be able to bring up a complex cluster in any order,
rather than in waves of startups all needing synchronisation to avoid
error.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Alvaro Herrera
Andres Freund wrote:
 On 2013-01-08 14:35:12 -0500, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   On 2013-01-08 14:25:06 -0500, Tom Lane wrote:
   This patch seems unnecessary given that we already put a version of 
   Assert()
   into postgres_fe.h.
  
   The problem is that some (including existing) pieces of code need to
   include postgres.h itself, those can't easily include postgres_fe.h as
   well without getting into problems with redefinitions.
  
  There is no place, anywhere, that should be including both.  So I don't
  see the problem.
 
 Sorry, misremembered the problem somewhat. The problem is that code that
 includes postgres.h atm ends up with ExceptionalCondition() et
 al. declared even if FRONTEND is defined. So if anything uses an assert
 you need to provide wrappers for those which seems nasty. If they are
 provided centrally and check for FRONTEND that problem doesn't exist.

I think the right fix here is to fix things so that postgres.h is not
necessary.  How hard is that?  Maybe it just requires some more
reshuffling of xlog headers.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Alvaro Herrera
Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Uhm, we don't have & need palloc support and I don't think
  relpathbackend() is a good justification for adding it.
 
 I've said from the very beginning of this effort that it would be
 impossible to share any meaningful amount of code between frontend and
 backend environments without adding some sort of emulation of
 palloc/pfree/elog.  I think this patch is just making the code uglier
 and more fragile to put off the inevitable, and that we'd be better
 served to bite the bullet and do that.

As far as this patch is concerned, I think it's sufficient to do

#define palloc(x) malloc(x)
#define pfree(x) free(x)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 15:27:23 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Uhm, we don't have & need palloc support and I don't think
  relpathbackend() is a good justification for adding it.
 
 I've said from the very beginning of this effort that it would be
 impossible to share any meaningful amount of code between frontend and
 backend environments without adding some sort of emulation of
 palloc/pfree/elog.  I think this patch is just making the code uglier
 and more fragile to put off the inevitable, and that we'd be better
 served to bite the bullet and do that.

If you think relpathbackend() alone warrants that, yes, I can provide a
wrapper. Everything else is imo already handled in a sensible and not
really ugly manner? Imo its not worth the effort *for this alone*.

I already had some elog(), ereport(), whatever emulation but Heikki
preferred not to have it, so it's removed by now.

To what extent do you want palloc et al. emulation? Provide actual pools
or just make them redirect to malloc and provide the required symbols (at the
very least CurrentMemoryContext)?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Andres Freund
On 2013-01-08 17:36:19 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2013-01-08 14:35:12 -0500, Tom Lane wrote:
   Andres Freund and...@2ndquadrant.com writes:
On 2013-01-08 14:25:06 -0500, Tom Lane wrote:
This patch seems unnecessary given that we already put a version of 
Assert()
into postgres_fe.h.
  
The problem is that some (including existing) pieces of code need to
include postgres.h itself, those can't easily include postgres_fe.h as
well without getting into problems with redefinitions.
  
   There is no place, anywhere, that should be including both.  So I don't
   see the problem.
 
  Sorry, misremembered the problem somewhat. The problem is that code that
  includes postgres.h atm ends up with ExceptionalCondition() et
  al. declared even if FRONTEND is defined. So if anything uses an assert
  you need to provide wrappers for those which seems nasty. If they are
  provided centrally and check for FRONTEND that problem doesn't exist.

 I think the right fix here is to fix things so that postgres.h is not
 necessary.  How hard is that?  Maybe it just requires some more
 reshuffling of xlog headers.

I don't really think thats realistic. Think the rmgrdesc/* files and
xlogreader.c itself...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 To what extent do you want palloc et al. emulation? Provide actual pools
 or just make them redirect to malloc and provide the required symbols (at the
 very least CurrentMemoryContext)?

I don't see any need for memory pools, at least not for frontend
applications of the currently envisioned levels of complexity.  I concur
with Alvaro's suggestion about just #define'ing them to malloc/free ---
or maybe better, pg_malloc/free so that we can have a failure-checking
wrapper.
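
A sketch of that kind of failure-checking wrapper; the exact name and
the exit-on-failure behavior are assumptions, not a committed API:

#include <stdio.h>
#include <stdlib.h>

static void *
pg_malloc(size_t size)
{
	void	   *result = malloc(size);

	if (result == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
	return result;
}

#define palloc(sz)	pg_malloc(sz)
#define pfree(ptr)	free(ptr)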

Not sure how we ought to handle elog, but maybe we can put off that bit
of design until we have a more concrete use-case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Heikki Linnakangas

On 08.01.2013 22:39, Andres Freund wrote:

On 2013-01-08 15:27:23 -0500, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

Uhm, we don't have & need palloc support and I don't think
relpathbackend() is a good justification for adding it.


I've said from the very beginning of this effort that it would be
impossible to share any meaningful amount of code between frontend and
backend environments without adding some sort of emulation of
palloc/pfree/elog.  I think this patch is just making the code uglier
and more fragile to put off the inevitable, and that we'd be better
served to bite the bullet and do that.


If you think relpathbackend() alone warrants that, yes, I can provide a
wrapper. Everything else is imo already handled in a sensible and not
really ugly manner? Imo its not worth the effort *for this alone*.

I already had some elog(), ereport(), whatever emulation but Heikki
preferred not to have it, so it's removed by now.

To what extent do you want palloc et al. emulation? Provide actual pools
or just make them redirect to malloc and provide the required symbols (at the
very least CurrentMemoryContext)?


Note that the xlogreader facility doesn't need any more emulation. I'm 
quite satisfied with that part of the patch now. However, the rmgr desc 
routines do, and I'm not very happy with those. Not sure what to do 
about it. As you said, we could add enough infrastructure to make them 
compile in frontend context, or we could try to make them rely less on 
backend functions.


One consideration is that once we have a separate pg_xlogdump utility, I 
expect that to be the most visible consumer of the rmgr desc functions. 
The backend will of course still use those functions in e.g error 
messages, but those don't happen very often. Not sure how that should be 
taken into account in this patch, but I thought I'd mention it.


An rmgr desc routine probably shouldn't be calling elog() even in the 
backend, because you don't want to throw errors in the contexts where 
those routines are called.
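
A sketch of that style, with a made-up rmgr: trouble is noted in the
output buffer rather than thrown.

#include "postgres.h"
#include "lib/stringinfo.h"

static void
example_desc(StringInfo buf, uint8 xl_info, char *rec)
{
	if (rec == NULL)
	{
		/* never elog(ERROR) from a desc routine */
		appendStringInfoString(buf, "(invalid record)");
		return;
	}
	appendStringInfo(buf, "info %u", (unsigned) xl_info);
}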


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

2013-01-08 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Andres Freund wrote:
 Sorry, misremembered the problem somewhat. The problem is that code that
 includes postgres.h atm ends up with ExceptionalCondition() et
 al. declared even if FRONTEND is defined. So if anything uses an assert
 you need to provide wrappers for those which seems nasty. If they are
 provided centrally and check for FRONTEND that problem doesn't exist.

 I think the right fix here is to fix things so that postgres.h is not
 necessary.  How hard is that?  Maybe it just requires some more
 reshuffling of xlog headers.

That would definitely be the ideal fix.  In general, #include'ing
postgres.h into code that's not backend code opens all kinds of hazards
that are likely to bite us sooner or later; the incompatibility of the
Assert definitions is just the tip of that iceberg.  (In the past we've
had issues around stdio.h providing different definitions in the two
environments, for example.)

But having said that, I'm also now remembering that we have files in
src/port/ and probably elsewhere that try to work in both environments
by just #include'ing c.h directly (relying on the Makefile to supply
-DFRONTEND or not).  If we wanted to support Assert use in such files,
we would have to move at least the Assert() macro definitions into c.h
as Andres is proposing.  Now, I've always thought that #include'ing c.h
directly was kind of an ugly hack, but I don't have a better design than
that to offer ATM.
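
A sketch of that src/port pattern, assuming the centralized Assert under
discussion:

/* compiled twice; the Makefile supplies -DFRONTEND for the client build */
#include "c.h"

int
example_port_func(int x)
{
	Assert(x >= 0);				/* must come from c.h to work in both builds */
	return x + 1;
}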

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-08 Thread Tomas Vondra
On 8.1.2013 03:47, Shigeru Hanada wrote:
 * naming of DROP_RELATIONS_BSEARCH_LIMIT (or off-by-one bug?)
 IIUC bsearch is used when # of relations to be dropped is *more* than
 the value of DROP_RELATIONS_BSEARCH_LIMIT (10).  IMO this behavior is
 not what the macro name implies; I thought that bsearch is used for 10
 relations.  Besides, the word LIMIT is used as *upper limit* in
 other macros.  How about MIN_DROP_REL[ATION]S_BSEARCH or
 DROP_REL[ATION]S_LINEAR_SEARCH_LIMIT?
 # using RELS instead of RELATIONS would be better to shorten the name


 I've changed the name to DROP_RELS_BSEARCH_THRESHOLD. I believe this
 naming is consistent with options like geqo_threshold - when the
 number of relations reaches the specified value, the bsearch is used.

 I've also increased the value from 10 to 20, in accordance with the
 previous discussion.
 
 New name sounds good to me, but the #define says that the value is 15.
 Should it be fixed to 20?

Ah, yes, please increase it to 20. I've increased it from 10 to 15
first, but I think 20 is nearer the optimal value and I forgot to change
it in the code.
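
For illustration, a sketch of the geqo_threshold-style switch being
described; rnode_comparator is assumed to be the patch's qsort/bsearch
comparator, and the other names are simplified:

#include "postgres.h"

#include <stdlib.h>

#include "storage/relfilenode.h"

#define DROP_RELS_BSEARCH_THRESHOLD 20

extern int	rnode_comparator(const void *p1, const void *p2);

static bool
rnode_is_dropped(RelFileNode *rnodes, int nrels, RelFileNode target)
{
	if (nrels < DROP_RELS_BSEARCH_THRESHOLD)
	{
		int			i;

		/* few entries: a plain linear scan is cheapest */
		for (i = 0; i < nrels; i++)
			if (RelFileNodeEquals(rnodes[i], target))
				return true;
		return false;
	}

	/* rnodes[] must already be sorted once the threshold is reached */
	return bsearch(&target, rnodes, nrels, sizeof(RelFileNode),
				   rnode_comparator) != NULL;
}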

 
 * +1 for Alvaro's suggestion about avoiding palloc traffic by starting
 with 8 elements or so.


 Done.
 
 Not yet.  Initial size of srels array is still 1 element.

I've checked the patch and I see there 'maxrels = 8' - or do you mean
something else?


Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 22:47:43 +0200, Heikki Linnakangas wrote:
 On 08.01.2013 22:39, Andres Freund wrote:
 On 2013-01-08 15:27:23 -0500, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
  Uhm, we don't have & need palloc support and I don't think
 relpathbackend() is a good justification for adding it.
 
 I've said from the very beginning of this effort that it would be
 impossible to share any meaningful amount of code between frontend and
 backend environments without adding some sort of emulation of
 palloc/pfree/elog.  I think this patch is just making the code uglier
 and more fragile to put off the inevitable, and that we'd be better
 served to bite the bullet and do that.
 
 If you think relpathbackend() alone warrants that, yes, I can provide a
 wrapper. Everything else is imo already handled in a sensible and not
 really ugly manner? Imo its not worth the effort *for this alone*.
 
 I already had some elog(), ereport(), whatever emulation but Heikki
 preferred not to have it, so it's removed by now.
 
 To what extent do you want palloc et al. emulation? Provide actual pools
 or just make redirect to malloc and provide the required symbols (at the
 very least CurrentMemoryContext)?
 
 Note that the xlogreader facility doesn't need any more emulation. I'm quite
 satisfied with that part of the patch now. However, the rmgr desc routines
 do, and I'm not very happy with those. Not sure what to do about it. As you
 said, we could add enough infrastructure to make them compile in frontend
 context, or we could try to make them rely less on backend functions.

Which emulation needs are you missing in the desc routines besides
relpathbackend() and timestamptz_to_str()? pfree() is just needed
because its used to free the result of relpathbackend(). No own pallocs,
no ereport ...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Heikki Linnakangas

On 08.01.2013 23:00, Andres Freund wrote:

Note that the xlogreader facility doesn't need any more emulation. I'm quite
satisfied with that part of the patch now. However, the rmgr desc routines
do, and I'm not very happy with those. Not sure what to do about it. As you
said, we could add enough infrastructure to make them compile in frontend
context, or we could try to make them rely less on backend functions.


Which emulation needs are you missing in the desc routines besides
relpathbackend() and timestamptz_to_str()? pfree() is just needed
because its used to free the result of relpathbackend(). No own pallocs,
no ereport ...


Nothing besides those, those are the stuff I was referring to.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade regression test litters the source tree with log files

2013-01-08 Thread Bruce Momjian
On Tue, Jan  8, 2013 at 01:08:44PM -0500, Tom Lane wrote:
 In a tree in which I previously ran make check in contrib/pg_upgrade:
 
 $ make -s distclean
 $ git status
 # On branch master
 # Untracked files:
#   (use "git add <file>..." to include in what will be committed)
 #
 #   contrib/pg_upgrade/pg_upgrade_dump_1.log
 #   contrib/pg_upgrade/pg_upgrade_dump_12912.log
 #   contrib/pg_upgrade/pg_upgrade_dump_16384.log
nothing added to commit but untracked files present (use "git add" to track)
 
 Not sure how long this has been happening.

Those look like files left over from a failed upgrade, or you used
--retain.   Does that make sense?  Because they are tracked by oid, it
is possible a later successful upgrade would not remove all those files,
but it should remove contrib/pg_upgrade/pg_upgrade_dump_1.log because it
is 1.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade regression test litters the source tree with log files

2013-01-08 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Tue, Jan  8, 2013 at 01:08:44PM -0500, Tom Lane wrote:
 In a tree in which I previously ran make check in contrib/pg_upgrade:
 
 $ make -s distclean
 $ git status
 # On branch master
 # Untracked files:
 #   (use "git add <file>..." to include in what will be committed)
 #
 #   contrib/pg_upgrade/pg_upgrade_dump_1.log
 #   contrib/pg_upgrade/pg_upgrade_dump_12912.log
 #   contrib/pg_upgrade/pg_upgrade_dump_16384.log
 nothing added to commit but untracked files present (use "git add" to track)
 
 Not sure how long this has been happening.

 Those look like files left over from a failed upgrade, or you used
 --retain.   Does that make sense?

It's possible that I had one or more failed regression test runs on that
machine ... don't recall for sure.  In any case the point here is that
make clean ought to get rid of anything that might be left over from a
test run, successful or otherwise.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 15:45:07 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  To what extent do you want palloc et al. emulation? Provide actual pools
  or just make them redirect to malloc and provide the required symbols (at the
  very least CurrentMemoryContext)?
 
 I don't see any need for memory pools, at least not for frontend
 applications of the currently envisioned levels of complexity.  I concur
 with Alvaro's suggestion about just #define'ing them to malloc/free ---
 or maybe better, pg_malloc/free so that we can have a failure-checking
 wrapper.

Unless we want to introduce those into common headers, its more complex
than #define's, you actually need to provide at least
palloc/pfree/CurrentMemoryContext symbols.

Still seems like a shame to do that for one lonely pfree() (+ something
an eventual own implementation of relpathbackend().

 Not sure how we ought to handle elog, but maybe we can put off that bit
 of design until we have a more concrete use-case.

Agreed.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade regression test litters the source tree with log files

2013-01-08 Thread Peter Eisentraut
On 1/8/13 4:04 PM, Bruce Momjian wrote:
 On Tue, Jan  8, 2013 at 01:08:44PM -0500, Tom Lane wrote:
 In a tree in which I previously ran make check in contrib/pg_upgrade:

 $ make -s distclean
 $ git status
 # On branch master
 # Untracked files:
  #   (use "git add <file>..." to include in what will be committed)
 #
 #   contrib/pg_upgrade/pg_upgrade_dump_1.log
 #   contrib/pg_upgrade/pg_upgrade_dump_12912.log
 #   contrib/pg_upgrade/pg_upgrade_dump_16384.log
  nothing added to commit but untracked files present (use "git add" to track)

 Not sure how long this has been happening.
 
 Those look like files left over from a failed upgrade, or you used
 --retain.   Does that make sense?  Because they are tracked by oid, it
 is possible a later successful upgrade would not remove all those files,
  but it should remove contrib/pg_upgrade/pg_upgrade_dump_1.log because it
 is 1.

I think this came in with the pg_upgrade --jobs option.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade regression test litters the source tree with log files

2013-01-08 Thread Bruce Momjian
On Tue, Jan  8, 2013 at 04:08:42PM -0500, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Tue, Jan  8, 2013 at 01:08:44PM -0500, Tom Lane wrote:
  In a tree in which I previously ran make check in contrib/pg_upgrade:
  
  $ make -s distclean
  $ git status
  # On branch master
  # Untracked files:
  #   (use "git add <file>..." to include in what will be committed)
  #
  #   contrib/pg_upgrade/pg_upgrade_dump_1.log
  #   contrib/pg_upgrade/pg_upgrade_dump_12912.log
  #   contrib/pg_upgrade/pg_upgrade_dump_16384.log
  nothing added to commit but untracked files present (use "git add" to track)
  
  Not sure how long this has been happening.
 
  Those look like files left over from a failed upgrade, or you used
  --retain.   Does that make sense?
 
 It's possible that I had one or more failed regression test runs on that
 machine ... don't recall for sure.  In any case the point here is that
 make clean ought to get rid of anything that might be left over from a
 test run, successful or otherwise.

That seems like something more for the regression script (test.sh) to
delete.  Those are output by _running_ the program, and I never expected
people to be running it in the git tree.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade regression test litters the source tree with log files

2013-01-08 Thread Bruce Momjian
On Tue, Jan  8, 2013 at 04:11:41PM -0500, Peter Eisentraut wrote:
 On 1/8/13 4:04 PM, Bruce Momjian wrote:
  On Tue, Jan  8, 2013 at 01:08:44PM -0500, Tom Lane wrote:
  In a tree in which I previously ran make check in contrib/pg_upgrade:
 
  $ make -s distclean
  $ git status
  # On branch master
  # Untracked files:
  #   (use "git add <file>..." to include in what will be committed)
  #
  #   contrib/pg_upgrade/pg_upgrade_dump_1.log
  #   contrib/pg_upgrade/pg_upgrade_dump_12912.log
  #   contrib/pg_upgrade/pg_upgrade_dump_16384.log
  nothing added to commit but untracked files present (use "git add" to track)
 
  Not sure how long this has been happening.
  
  Those look like files left over from a failed upgrade, or you used
  --retain.   Does that make sense?  Because they are tracked by oid, it
  is possible a later successful upgrade would not remove all those files,
   but it should remove contrib/pg_upgrade/pg_upgrade_dump_1.log because it
  is 1.
 
 I think this came in with the pg_upgrade --jobs option.

Yes, it was part of the split to allow creation of per-database SQL
files, but pg_upgrade always created files in the current directory ---
there are just more of them now.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 23:02:15 +0200, Heikki Linnakangas wrote:
 On 08.01.2013 23:00, Andres Freund wrote:
 Note that the xlogreader facility doesn't need any more emulation. I'm quite
 satisfied with that part of the patch now. However, the rmgr desc routines
 do, and I'm not very happy with those. Not sure what to do about it. As you
 said, we could add enough infrastructure to make them compile in frontend
 context, or we could try to make them rely less on backend functions.
 
 Which emulation needs are you missing in the desc routines besides
 relpathbackend() and timestamptz_to_str()? pfree() is just needed
 because its used to free the result of relpathbackend(). No own pallocs,
 no ereport ...

 Nothing besides those, those are the stuff I was referring to.

Yea :(. As I said, I think it's reasonable to emulate the former but
timestamptz_to_str() seems to be too complex for that.

Extracting the whole datetime/timestamp handling into a backend
independent module seems like complex overkill to me although it might
be useful to reduce the duplication of all that code in ecpg (which has
deviated quite a bit by now). No idea how to solve that other than
returning placeholder data atm.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread Peter Eisentraut
On 1/7/13 5:15 PM, Andrew Dunstan wrote:
 You (Merlin) have kindly volunteered to work on documentation, so before
 we go too far with that any bikeshedding on names, or on the
 functionality being provided, should now take place.

Hmm, I was going to say, this patch contains no documentation, so I have
no idea what it is supposed to do.  "Recently discussed" isn't a good
substitute for describing what the patch is supposed to accomplish.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/Python result object str handler

2013-01-08 Thread Peter Eisentraut
On 1/8/13 4:32 AM, Magnus Hagander wrote:
 How does it work if there are many rows in there? Say the result
 contains 10,000 rows - will the string contain all of them? If so,
 might it be worthwhile to cap the number of rows shown and then follow
 with a "..." or something?

I don't think so.  Any number you pick will be too low for someone.
Since this would only be executed when explicitly asked for, it's up to
the user to manage this.  It's analogous to print(long_list) -- you
wouldn't truncate that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/Python result object str handler

2013-01-08 Thread Magnus Hagander
On Tue, Jan 8, 2013 at 10:23 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 1/8/13 4:32 AM, Magnus Hagander wrote:
 How does it work if there are many rows in there? Say the result
 contains 10,000 rows - will the string contain all of them? If so,
 might it be worthwhile to cap the number of rows shown and then follow
 with a ... or something?

 I don't think so.  Any number you pick will be too low for someone.
 Since this would only be executed when explicitly asked for, it's up to
 the user to manage this.  It's analogous to print(long_list) -- you
 wouldn't truncate that.

Fair enough. I was thinking of a specific example when I wrote that,
but I can't recall what it was, and clearly using print or the Python
console would be the most similar scenarios. And they both do it the
way you suggest. So that's probably the right thing to do.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/Python result object str handler

2013-01-08 Thread Peter Eisentraut
On 1/8/13 11:55 AM, Daniele Varrazzo wrote:
 <PLyResult status=5 nrows=2 rows=[{'foo': 1, 'bar': '11'}, {'foo': 2,
 'bar': '22'}]>
 This looks more like a repr-style format to me (if you implement repr
 but not str, the latter will default to the former).

The repr style was the only guideline I found.  There is no guideline
for what str should look like when it's not repr.  Do you have a better
suggestion for the output format?

(The reason this is str and not repr is that it doesn't contain other
information such as the tuple descriptor, so str of two different
results could easily be the same.)




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-08 Thread Alvaro Herrera
Tomas Vondra wrote:
 On 8.1.2013 03:47, Shigeru Hanada wrote:

  * +1 for Alvaro's suggestion about avoiding palloc traffic by starting
  with 8 elements or so.
 
  Done.
  
  Not yet.  Initial size of srels array is still 1 element.
 
 I've checked the patch and I see there 'maxrels = 8' - or do you mean
 something else?

Well, you need to ensure that the initial palloc is an array of that
size.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread james

You can use COPY from a stored procedure, but only to and from files.


I think that's in the chocolate fireguard realm though as far as
efficiency for this sort of scenario goes, even if it's handled by
retaining an mmap'd file as workspace.





If SPI provided a way to perform a copy to a temp table and then some callback 
on an iterator that yields rows to it, that would do the trick I guess.


SPI is useful, but it's certainly possible to avoid its use. After all,
that's what almost the whole backend does, including the COPY code. Of
course, it's a lot harder to write that way, which is part of why SPI
exists. Efficiency has its price.


So it is possible to use a lower-level interface from a C stored proc?
SPI is the (only) documented direct function extension API, isn't it?


Is the issue with converting the JSON data to a record set that the
parsing can be costly?  Perhaps it can be achieved with base64 of
compressed protobuf, or some such.  I don't mind if it seems a bit messy
- the code can be generated from the table easily enough, especially if
I can use C++.  I guess an allocator that uses SPI_palloc would solve
issues with memory management on error?




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread Merlin Moncure
On Tue, Jan 8, 2013 at 3:19 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 1/7/13 5:15 PM, Andrew Dunstan wrote:
 You (Merlin) have kindly volunteered to work on documentation, so before
 we go too far with that, any bikeshedding on names, or on the
 functionality being provided, should now take place.

 Hmm, I was going to say, this patch contains no documentation, so I have
 no idea what it is supposed to do.  "Recently discussed" isn't a good
 substitute for describing what the patch is supposed to accomplish.

Why not?  There are functional examples in the docs and the purpose of
the various functions was hashed out pretty well a couple weeks back,
deficiencies corrected, etc.

reference: http://postgresql.1045698.n5.nabble.com/json-accessors-td5733929.html

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-08 Thread Tomas Vondra
On 8.1.2013 22:30, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 On 8.1.2013 03:47, Shigeru Hanada wrote:
 
 * +1 for Alvaro's suggestion about avoiding palloc traffic by starting
 with 8 elements or so.

 Done.

 Not yet.  Initial size of srels array is still 1 element.

 I've checked the patch and I see there 'maxrels = 8' - or do you mean
 something else?
 
 Well, you need to ensure that the initial palloc is an array of that
 size.

Oh, right - I forgot to modify the palloc() call. Thanks for spotting
this. Attached is a patch with increased threshold and fixed palloc call.

Tomas
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 8dc622a..b1790eb 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -312,6 +312,11 @@ smgrDoPendingDeletes(bool isCommit)
 	PendingRelDelete *pending;
 	PendingRelDelete *prev;
 	PendingRelDelete *next;
+
+	int			nrels = 0,
+				i = 0,
+				maxrels = 8;
+	SMgrRelation   *srels = palloc(maxrels * sizeof(SMgrRelation));
 
 	prev = NULL;
 	for (pending = pendingDeletes; pending != NULL; pending = next)
@@ -335,14 +340,32 @@ smgrDoPendingDeletes(bool isCommit)
 SMgrRelation srel;
 
				srel = smgropen(pending->relnode, pending->backend);
-				smgrdounlink(srel, false);
-				smgrclose(srel);
+
+				/* extend the array if needed (double the size) */
+				if (maxrels <= nrels)
+				{
+					maxrels *= 2;
+					srels = repalloc(srels, sizeof(SMgrRelation) * maxrels);
+				}
+
+				srels[nrels++] = srel;
 			}
 			/* must explicitly free the list entry */
 			pfree(pending);
 			/* prev does not change */
 		}
 	}
+
+	if (nrels > 0)
+	{
+		smgrdounlinkall(srels, nrels, false);
+
+		for (i = 0; i < nrels; i++)
+			smgrclose(srels[i]);
+	}
+
+	pfree(srels);
+
 }
 
 /*
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index dddb6c0..2330fda 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -62,6 +62,7 @@
 #define BUF_WRITTEN0x01
 #define BUF_REUSABLE			0x02
 
+#define DROP_RELS_BSEARCH_THRESHOLD		20
 
 /* GUC variables */
 bool		zero_damaged_pages = false;
@@ -108,6 +109,7 @@ static volatile BufferDesc *BufferAlloc(SMgrRelation smgr,
 static void FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln);
 static void AtProcExit_Buffers(int code, Datum arg);
 
+static int rnode_comparator(const void *p1, const void *p2);
 
 /*
  * PrefetchBuffer -- initiate asynchronous read of a block of a relation
@@ -2094,35 +2096,87 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
  * 
  */
 void
-DropRelFileNodeAllBuffers(RelFileNodeBackend rnode)
+DropRelFileNodeAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
 {
-	int			i;
+	int i, j, n = 0;
+	RelFileNode *nodes;
+
+	nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
 
 	/* If it's a local relation, it's localbuf.c's problem. */
-	if (RelFileNodeBackendIsTemp(rnode))
+	for (i = 0; i < nnodes; i++)
 	{
-		if (rnode.backend == MyBackendId)
-			DropRelFileNodeAllLocalBuffers(rnode.node);
+		if (RelFileNodeBackendIsTemp(rnodes[i]))
+		{
+			if (rnodes[i].backend == MyBackendId)
+				DropRelFileNodeAllLocalBuffers(rnodes[i].node);
+		}
+		else
+		{
+			nodes[n++] = rnodes[i].node;
+		}
+	}
+
+	/*
+	 * If there are no non-local relations, then we're done. Release the memory
+	 * and return.
+	 */
+	if (n == 0)
+	{
+		pfree(nodes);
 		return;
 	}
 
+	/* sort the list of rnodes (but only if we're going to use the bsearch) */
+	if (n > DROP_RELS_BSEARCH_THRESHOLD)
+		pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+
	for (i = 0; i < NBuffers; i++)
 	{
+		RelFileNode *rnode = NULL;
 		volatile BufferDesc *bufHdr = BufferDescriptors[i];
-
+ 
 		/*
 		 * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
 		 * and saves some cycles.
 		 */
-		if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+
+		/*
+		 * For low number of relations to drop just use a simple walk through,
+		 * For a low number of relations to drop, just use a simple walk-through
+		 * to save the bsearch overhead. DROP_RELS_BSEARCH_THRESHOLD is rather
+		 * a guess than an exactly determined value, as it depends on many
+		 * factors (CPU and RAM speeds, amount of shared buffers etc.).
+		if (n <= DROP_RELS_BSEARCH_THRESHOLD)
+		{
+			for (j = 0; j < n; j++)
+			{
+				if (RelFileNodeEquals(bufHdr->tag.rnode, nodes[j]))
+				{
+					rnode = &nodes[j];
+					break;
+				}
+			}
+		}
+		else
+		{
+			rnode = bsearch((const void *) &(bufHdr->tag.rnode),
+							nodes, n, sizeof(RelFileNode),
+							rnode_comparator);
+		}
+
+		/* buffer does not belong to any of the relations */
+		if (rnode == NULL)
 			continue;
 
 		LockBufHdr(bufHdr);
-		if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+		if (RelFileNodeEquals(bufHdr->tag.rnode, (*rnode)))
 			InvalidateBuffer(bufHdr);	/* releases spinlock */
 		else
 			UnlockBufHdr(bufHdr);
 	}
+
+	pfree(nodes);
 }
 
 /* 

Re: [HACKERS] Weird Assert failure in GetLockStatusData()

2013-01-08 Thread Tom Lane
I wrote:
 This is a bit disturbing:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushpig&dt=2013-01-07%2019%3A15%3A02
 ...
 The assertion failure seems to indicate that the number of
 LockMethodProcLockHash entries found by hash_seq_search didn't match the
 number that had been counted by hash_get_num_entries immediately before
 that.  I don't see any bug in GetLockStatusData itself, so this suggests
 that there's something wrong with dynahash's entry counting, or that
 somebody somewhere is modifying the shared hash table without holding
 the appropriate lock.  The latter seems a bit more likely, given that
 this must be a very low-probability bug or we'd have seen it before.
 An overlooked locking requirement in a seldom-taken code path would fit
 the symptoms.

After digging around a bit, I can find only one place where it looks
like somebody might be messing with the LockMethodProcLockHash table
while not holding the appropriate lock-partition LWLock(s):

1. VirtualXactLock finds target xact holds its VXID lock fast-path.
2. VirtualXactLock calls SetupLockInTable to convert the fast-path lock
   to regular.
3. SetupLockInTable makes entries in LockMethodLockHash and
   LockMethodProcLockHash.

I see no partition lock acquisition anywhere in the above code path.
Is there one that I'm missing?  Why isn't SetupLockInTable documented
as expecting the caller to hold the partition lock, as is generally
done for lock.c subroutines that require that?
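
(For comparison, the usual pattern elsewhere in lock.c is to bracket
any update of those shared hash tables with the partition lock,
roughly like this sketch -- the variable names are illustrative:)

	LWLockId	partitionLock = LockHashPartitionLock(hashcode);

	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
	/* only now is it safe to create or modify entries in the tables */
	proclock = SetupLockInTable(lockMethodTable, proc, &locktag,
								hashcode, lockmode);
	LWLockRelease(partitionLock);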

If this is a bug, it's rather disturbing that it took us this long to
recognize it.  That code path isn't all that seldom-taken, AFAIK.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json api WIP patch

2013-01-08 Thread Andrew Dunstan


On 01/08/2013 04:32 PM, Merlin Moncure wrote:

On Tue, Jan 8, 2013 at 3:19 PM, Peter Eisentraut pete...@gmx.net wrote:

On 1/7/13 5:15 PM, Andrew Dunstan wrote:

You (Merlin) have kindly volunteered to work on documentation, so before
we go too far with that, any bikeshedding on names, or on the
functionality being provided, should now take place.

Hmm, I was going to say, this patch contains no documentation, so I have
no idea what it is supposed to do.  "Recently discussed" isn't a good
substitute for describing what the patch is supposed to accomplish.

Why not?  There are functional examples in the docs and the purpose of
the various functions was hashed out pretty well a couple weeks back,
deficiencies corrected, etc.

reference: http://postgresql.1045698.n5.nabble.com/json-accessors-td5733929.html


Well, at a high level the patch is meant to do two things: provide an 
API that can be used to build JSON processing functions easily, and 
provide some basic json processing functions built on the API. Those 
functions provide similar capabilities to the accessor functions that 
hstore has.


Perhaps also this will help. Here is the list of functions and operators 
as currently implemented. I also have working operators for the get_path 
functions which will be in a future patch.


All these are used in the included regression tests.


           Name           | Result data type |                          Argument data types
--------------------------+------------------+-------------------------------------------------------------------------
 json_array_length        | integer          | json
 json_each                | SETOF record     | from_json json, OUT key text, OUT value json
 json_each_as_text        | SETOF record     | from_json json, OUT key text, OUT value text
 json_get                 | json             | json, integer
 json_get                 | json             | json, text
 json_get_as_text         | text             | json, integer
 json_get_as_text         | text             | json, text
 json_get_path            | json             | from_json json, VARIADIC path_elems text[]
 json_get_path_as_text    | text             | from_json json, VARIADIC path_elems text[]
 json_object_keys         | SETOF text       | json
 json_populate_record     | anyelement       | anyelement, json
 json_populate_recordset  | SETOF anyelement | base anyelement, from_json json, use_json_as_text boolean DEFAULT false
 json_unnest              | SETOF json       | from_json json, OUT value json




 Name | Left arg type | Right arg type | Result type |          Description
------+---------------+----------------+-------------+---------------------------------
 ->   | json          | integer        | json        | get json array element
 ->   | json          | text           | json        | get json object field
 ->>  | json          | integer        | text        | get json array element as text
 ->>  | json          | text           | text        | get json object field as text



cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-08 Thread Daniel Farina
On Tue, Jan 8, 2013 at 11:51 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2013 18:46, Josh Berkus j...@agliodbs.com wrote:
 On 1/5/13 1:21 PM, Peter Geoghegan wrote:

 On 21 December 2012 14:08, Robert Haas robertmh...@gmail.com wrote:

 I'm sure it's possible; I don't *think* it's terribly easy.


 I'm inclined to agree that this isn't a terribly pressing issue.
 Certainly, the need to introduce a bunch of new infrastructure to
 detect this case seems hard to justify.


 Impossible to justify, I'd say.

 Does anyone have any objections to my adding this to the TODO list, in case
 some clever GSOC student comes up with a way to do it *without* adding a
 bunch of infrastructure?

 Daniel already did object

To briefly reiterate my objection: I observed that one may want to
enter a cyclic configuration on a temporary basis -- to assist with
some intermediate states in remastering -- and it'd be nice if
Postgres didn't try to get in the way of that.

I would like to have enough reporting to be able to write tools that
detect cyclicity and other configuration errors, and I think that may
exist already in recovery.conf/its successor in postgresql.conf.  A
notable problem here is that UDFs, by their mechanical nature, don't
quite cover all the use cases, as they require the server to be running
and hot standby to be available.  It seems like reading recovery.conf
or its successor is probably the best option here.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Proposal: Store timestamptz of database creation on pg_database

2013-01-08 Thread Peter Eisentraut
On 1/5/13 11:04 AM, Stephen Frost wrote:
 Creating a separate catalog (or two) every time we want to track XYZ for
 all objects is rather overkill...  Thinking about this a bit more, and
 noting that pg_description/shdescription more-or-less already exist as a
 framework for tracking 'something' for 'all catalog entries'- why don't
 we just add these columns to those tables..?  This would also address
 Peter's concern about making sure we do this 'wholesale' and in one
 release rather than spread across multiple releases- just make sure it
 covers the same set of things which 'comment' does.

Yeah, actually, the other day I was thinking we should get rid of all
the system catalogs and use a big EAV-like schema instead.  We're not
getting any relational-database value out of the current way, and it's
just a lot of duplicate code.  If we had a full EAV system, we could
even do in-place upgrade.

Obviously, this isn't going to happen any time soon or ever, but I think
I agree with your concern above as a partial step.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Index build temp files

2013-01-08 Thread Stephen Frost
Greetings,

  We were surprised recently to note that the temp files that are
  created during a CREATE INDEX don't go into either a temp tablespace
  set for the user or into the tablespace which the CREATE INDEX
  specifies.  Instead, they go only into base/pgsql_tmp/.  This doesn't
  allow for any flexibility in defining where to create these
  potentially quite large sets of files.

  Shouldn't these temp files be going into the temp tablespace for the
  user creating the index instead..?  Or perhaps into the tablespace
  which the index is being created in?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] I s this a bug of spgist index in a heavy write condition?

2013-01-08 Thread Tom Lane
Hailong Li hailong...@qunar.com writes:
 I am very excited to say that I may have created a test case!

I've been running variants of this example for most of the afternoon,
and have not seen a failure :-(.  So I'm afraid there is some aspect
of your situation that you've not provided enough information to
reproduce.  Most likely, that's the initial contents of your table,
which you didn't provide.  I tried seeding the table with the five
values you did show and then running the insertion loops, but no luck,
even after many millions of insertions with various timing changes.

Please see if you can put together a self-contained test case including
necessary test data.  (Note there's no reason to think that any of the
columns other than the spgist-indexed one are interesting, if that helps
you sanitize the data to the point you can let it out.)

The control flow in spgdoinsert.c is flat enough that the stack trace
alone isn't much help in understanding the bug, I'm afraid.  We can
guess that two insertions are trying to lock the same two index pages in
opposite orders, but without looking at the data that doesn't put us
much closer to a fix.
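
(If that guess is right, the standard cure is to impose a global
ordering on page locks.  A sketch using existing bufmgr calls, with
ascending block number as the order:)

static void
lock_two_pages_in_order(Buffer a, Buffer b)
{
	/*
	 * Always lock the lower-numbered block first, so two backends can
	 * never each hold one page while waiting for the other.
	 */
	if (BufferGetBlockNumber(a) < BufferGetBlockNumber(b))
	{
		LockBuffer(a, BUFFER_LOCK_EXCLUSIVE);
		LockBuffer(b, BUFFER_LOCK_EXCLUSIVE);
	}
	else
	{
		LockBuffer(b, BUFFER_LOCK_EXCLUSIVE);
		LockBuffer(a, BUFFER_LOCK_EXCLUSIVE);
	}
}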

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Proposal: Store timestamptz of database creation on pg_database

2013-01-08 Thread Pavel Stehule
2013/1/8 Peter Eisentraut pete...@gmx.net:
 On 1/5/13 11:04 AM, Stephen Frost wrote:
 Creating a separate catalog (or two) every time we want to track XYZ for
 all objects is rather overkill...  Thinking about this a bit more, and
 noting that pg_description/shdescription more-or-less already exist as a
 framework for tracking 'something' for 'all catalog entries'- why don't
 we just add these columns to those tables..?  This would also address
 Peter's concern about making sure we do this 'wholesale' and in one
 release rather than spread across multiple releases- just make sure it
 covers the same set of things which 'comment' does.

 Yeah, actually, the other day I was thinking we should get rid of all
 the system catalogs and use a big EAV-like schema instead.  We're not
 getting any relational-database value out of the current way, and it's
 just a lot of duplicate code.  If we had a full EAV system, we could
 even do in-place upgrade.


-1

now we have thousands of tables; I am not sure EAV can get good performance

Pavel

 Obviously, this isn't going to happen any time soon or ever, but I think
 I agree with your concern above as a partial step.



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 9:51 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Tue, Jan 8, 2013 at 10:20 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 8, 2013 at 4:04 AM, Takeshi Yamamuro
 yamamuro.take...@lab.ntt.co.jp wrote:
 Apart from my patch, what I care is that the current one might
 be much slow against I/O. For example, when compressing
 and writing large values, compressing data (20-40MiB/s) might be
 a dragger against writing data in disks (50-80MiB/s). Moreover,
 IMHO modern (and very fast) I/O subsystems such as SSD make a
 bigger issue in this case.

 What about just turning compression off?

 I've been relying on compression for some big serialized blob fields
 for some time now. I bet I'm not alone, lots of people save serialized
 data to text fields. So rather than removing it, I'd just change the
 default to off (if that was the decision).

 However, it might be best to evaluate some of the modern fast
 compression schemes like snappy/lz4 (250MB/s per core sounds pretty
 good), and implement pluggable compression schemes instead. Snappy
 wasn't designed for nothing, it was most likely because it was
 necessary. Cassandra (just to name a system I'm familiar with) started
 without compression, and then it was deemed necessary to the point
 they invested considerable time into it. I've always found the fact
 that pg does compression of toast tables quite forward-thinking, and
 I'd say the feature has to remain there, extended and modernized,
 maybe off by default, but there.

I'm not offering any opinion on whether we should have compression as
a general matter.  Maybe yes, maybe no, but my question was about the
OP's use case.  If he's willing to accept less efficient compression
in order to get faster compression, perhaps he should just not use
compression at all.

Personally, my biggest gripe about the way we do compression is that
it's easy to detoast the same object lots of times.  More generally,
our in-memory representation of user data values is pretty much a
mirror of our on-disk representation, even when that leads to excess
conversions.  Beyond what we do for TOAST, there's stuff like numeric,
where we not only detoast but then post-process the results into yet
another internal form before performing any calculations - and then of
course we have to convert back before returning from the calculation
functions.  And for things like XML, JSON, and hstore we have to
repeatedly parse the string, every time someone wants to do anything
with it.  Of course, solving this is a very hard problem, and not
solving it isn't a reason not to have more compression options - but
more compression options will not solve the problems that I personally
have in this area, by and large.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index build temp files

2013-01-08 Thread Bruce Momjian
On Tue, Jan  8, 2013 at 05:09:47PM -0500, Stephen Frost wrote:
 Greetings,
 
   We were surprised recently to note that the temp files that are
   created during a CREATE INDEX don't go into either a temp tablespace
   set for the user or into the tablespace which the CREATE INDEX
   specifies.  Instead, they go only into base/pgsql_tmp/.  This doesn't
   allow for any flexibility in defining where to create these
   potentially quite large sets of files.
 
   Shouldn't these temp files be going into the temp tablespace for the
   user creating the index instead..?  Or perhaps into the tablespace
   which the index is being created in?

Well, our docs for temp_tablespaces say:

This variable specifies tablespaces in which to create temporary
objects (temp tables and indexes on temp tables) when a
CREATE command does not explicitly specify a tablespace.
Temporary files for purposes such as sorting large data sets
are also created in these tablespaces.

Are you saying this is inaccurate?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Proposal: Store timestamptz of database creation on pg_database

2013-01-08 Thread Stephen Frost
* Pavel Stehule (pavel.steh...@gmail.com) wrote:
 2013/1/8 Peter Eisentraut pete...@gmx.net:
  On 1/5/13 11:04 AM, Stephen Frost wrote:
  Yeah, actually, the other day I was thinking we should get rid of all
  the system catalogs and use a big EAV-like schema instead.  We're not
  getting any relational-database value out of the current way, and it's
  just a lot of duplicate code.  If we had a full EAV system, we could
  even do in-place upgrade.
 
 
 -1
 
 now we have a thousands tables, I am not sure so EAV can get good performance

To be honest, my first reaction to this was an assumption that it was
pure sarcasm..

Seriously tho, the argument for not putting these things into the
various individual catalogs is that they'd create bloat and these items
don't need to be performant.  I would think that the kind of timestamps
that we're talking about fall into the same data category as comments on
tables.

If there isn't a good reason for comments on objects to be off in a
generic "this is for any kind of object" table, then perhaps we should
move them into the appropriate catalog tables?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Index build temp files

2013-01-08 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
 On Tue, Jan  8, 2013 at 05:09:47PM -0500, Stephen Frost wrote:
  Greetings,
  
We were surprised recently to note that the temp files that are
created during a CREATE INDEX don't go into either a temp tablespace
set for the user or into the tablespace which the CREATE INDEX
specifies.  Instead, they go only into base/pgsql_tmp/.  This doesn't
allow for any flexibility in defining where to create these
potentially quite large sets of files.
  
Shouldn't these temp files be going into the temp tablespace for the
user creating the index instead..?  Or perhaps into the tablespace
which the index is being created in?
 
 Well, our docs for temp_tablespaces say:
 
 This variable specifies tablespaces in which to create temporary
 objects (temp tables and indexes on temp tables) when a
 CREATE command does not explicitly specify a tablespace.
 Temporary files for purposes such as sorting large data sets
 are also created in these tablespaces.
 
 Are you saying this is inaccurate?

Yes and no?  Are the temporary files created during a CREATE INDEX
considered "Temporary files for purposes such as sorting large data
sets"?  My thinking is 'yes', but others may disagree.  Also,
considering this a bug would imply that it's back-patchable, and I'm not
convinced it's worth the risk of breaking things which depend on the
current behavior.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 9:53 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Tue, Jan 8, 2013 at 11:39 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Reference: 
 http://postgresql.1045698.n5.nabble.com/Simple-join-doesn-t-use-index-td5738689.html

 This is a pretty common gotcha: user sets shared_buffers but misses
 the esoteric but important effective_cache_size.  ISTM
 effective_cache_size should always be >= shared_buffers -- this is a
 soft configuration error that could be reported as a warning and
 perhaps overridden on the fly.

 Not true. If there are many concurrent users running concurrent
 queries against parallel databases, such as some test systems I have
 that contain many databases for many test environments, such a setting
 wouldn't make sense. If a DBA sets it to lower than shared_buffers,
 that setting has to be honored.

 Rather, I'd propose the default setting should be -1 or something
 default and automagic that works most of the time (but not all).

+1.  I've found that a value of three-quarters of system memory works
pretty well most of the time.  Of course, there's not a single,
portable way of getting that on every platform we support.  If I
remember my last investigation into this area, to use that particular
rule we would probably need at least three paths - one for Windows,
one for System V descendants, and one for BSD descendants.  And there
might still be obscure things that wouldn't be covered.  Of course
this also makes the admittedly unwarranted assumption that the
database owns the box, which could be wrong too, but purposely
lowballing effective_cache_size to discourage index-scan plans seems
unlikely to be a winning strategy.

A cruder heuristic that might be useful is 3 * shared_buffers.  If
people follow the guidance to set shared_buffers around 25% of RAM,
then this will work out to around 75% again.  Of course, many people,
for valid reasons, use smaller values of shared_buffers than that, and
a few use larger ones.  It might still be better than no auto-tuning,
though I wouldn't swear to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 3:00 PM, Andres Freund and...@2ndquadrant.com wrote:
 Uhm, we don't have & need palloc support and I don't think
 relpathbackend() is a good justification for adding it.

FWIW, I'm with Tom on this one.  Any meaningful code sharing is going
to need that, so we might as well bite the bullet.

And functions that return static buffers are evil incarnate.  I've
spent way too much of my life dealing with the supreme idiocy that is
fmtId().  If someone ever finds a way to make that go away, I will buy
them a beverage of their choice at the next conference we're both at.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Weird Assert failure in GetLockStatusData()

2013-01-08 Thread Tom Lane
I wrote:
 After digging around a bit, I can find only one place where it looks
 like somebody might be messing with the LockMethodProcLockHash table
 while not holding the appropriate lock-partition LWLock(s):

 1. VirtualXactLock finds target xact holds its VXID lock fast-path.
 2. VirtualXactLock calls SetupLockInTable to convert the fast-path lock
to regular.
 3. SetupLockInTable makes entries in LockMethodLockHash and
LockMethodProcLockHash.

 I see no partition lock acquisition anywhere in the above code path.

I've been able to reproduce the assertion crash by forcing the
appropriate timing with some carefully chosen debugger breakpoints.
So this theory is evidently correct.

 If this is a bug, it's rather disturbing that it took us this long to
 recognize it.  That code path isn't all that seldom-taken, AFAIK.

On closer inspection, VirtualXactLock() is only called in CREATE INDEX
CONCURRENTLY and DROP INDEX CONCURRENTLY (and yes, the failed test case
on bushpig was exercising DROP INDEX CONCURRENTLY).  So maybe the path
isn't that frequently taken after all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Alvaro Herrera
Robert Haas escribió:

 And functions that return static buffers are evil incarnate.  I've
 spent way too much of my life dealing with the supreme idiocy that is
 fmtId().

+1

 If someone ever finds a way to make that go away, I will buy
 them a beverage of their choice at the next conference we're both at.

+1

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Andres Freund
On 2013-01-08 17:28:33 -0500, Robert Haas wrote:
 On Tue, Jan 8, 2013 at 3:00 PM, Andres Freund and...@2ndquadrant.com wrote:
  Uhm, we don't have & need palloc support and I don't think
  relpathbackend() is a good justification for adding it.

 FWIW, I'm with Tom on this one.  Any meaningful code sharing is going
 to need that, so we might as well bite the bullet.

Yes, if we set the scope bigger than xlogreader, I agree. If it's
xlogreader itself, I don't. But as I happen to think we should share more
code...
Will prepare a patch.

I wonder whether it would be acceptable to make palloc() an actual
function instead of

#define palloc(sz)  MemoryContextAlloc(CurrentMemoryContext, (sz))

so we don't have to expose CurrentMemoryContext?
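
That is, something like this sketch - same semantics as the macro, but
CurrentMemoryContext stays private to mcxt.c:

void *
palloc(Size size)
{
	/* identical to the old macro expansion */
	return MemoryContextAlloc(CurrentMemoryContext, size);
}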

Alternatively we can just move the whole of utils/mmgr/* to port, but
that would imply an elog/ereport wrapper...

 And functions that return static buffers are evil incarnate.  I've
 spent way too much of my life dealing with the supreme idiocy that is
 fmtId().  If someone ever finds a way to make that go away, I will buy
 them a beverage of their choice at the next conference we're both at.

IMO it depends on the circumstances and the number of possible callers,
but anyway, it seems to have already been decided that my suggestion
isn't the way to go.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-08 Thread Bruce Momjian
On Tue, Jan  8, 2013 at 05:23:36PM -0500, Robert Haas wrote:
  Rather, I'd propose the default setting should be -1 or something
  default and automagic that works most of the time (but not all).
 
 +1.  I've found that a value of three-quarters of system memory works
 pretty well most of the time.  Of course, there's not a single,
 portable way of getting that on every platform we support.  If I
 remember my last investigation into this area, to use that particular
 rule we would probably need at least three paths - one for Windows,
 one for System V descendents, and one for BSD descendents.  And there
 might still be obscure things that wouldn't be covered.  Of course
 this also makes the admittedly unwarranted assumption that the
 database owns the box, which could be wrong too, but purposely
 lowballing effective_cache_size to discourage index-scan plans seems
 unlikely to be a winning strategy.
 
 A cruder heuristic that might be useful is 3 * shared_buffers.  If
 people follow the guidance to set shared_buffers around 25% of RAM,
 then this will work out to around 75% again.  Of course, many people,
 for valid reasons, use smaller values of shared_buffers than that, and
 a few use larger ones.  It might still be better than no auto-tuning,
 though I wouldn't swear to it.

Agreed.  This is similar to the fudge we have about random_page_cost:

Random access to mechanical disk storage is normally much more expensive
than four times sequential access.  However, a lower default is used
(4.0) because the majority of random accesses to disk, such as indexed
reads, are assumed to be in cache.  The default value can be thought of
as modeling random access as 40 times slower than sequential, while
expecting 90% of random reads to be cached.
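
(Worked out: if 90% of random reads hit cache at roughly zero cost and
the rest cost 40 times a sequential read, the expected cost is
0.9 * 0 + 0.1 * 40 = 4.0, the default.)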

effective_cache_size is impossible to set accurately because you have no
idea what other things might be in the cache, or what other concurrent
queries might be filling the cache.  Going with something at least
partly reasonable makes a lot more sense.  While we don't know the size
of RAM, we do know the size of shared_buffers, and keying on that for a
default seems like a no-brainer, rather than using 128MB.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 And functions that return static buffers are evil incarnate.  I've
 spent way too much of my life dealing with the supreme idiocy that is
 fmtId().  If someone ever finds a way to make that go away, I will buy
 them a beverage of their choice at the next conference we're both at.

Yeah, that was exactly the case that was top-of-mind when I was
complaining about static return buffers upthread.

It's not hard to make the ugliness go away: just let it strdup its
return value.  The problem is that in the vast majority of usages it
wouldn't be convenient to free the result, so we'd have a good deal
of memory leakage.  What might be interesting is to instrument it to
see how much (adding a counter to the function ought to be easy enough)
and then find out whether it's an amount we still care about in 2013.
Frankly, pg_dump is a memory hog already - a few more identifier-sized
strings lying about might not matter anymore.
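
A sketch of that instrumentation, piggybacked on a hypothetical
strdup'ing variant (pg_strdup() is pg_dump's existing OOM-checked
strdup; the counter names are made up):

#include <string.h>

extern const char *fmtId(const char *rawid);	/* the existing helper */
extern char *pg_strdup(const char *s);

static long fmtId_calls = 0;	/* number of allocations we'd make */
static long fmtId_bytes = 0;	/* bytes we'd leak if nobody freed them */

char *
fmtId_dup(const char *rawid)
{
	char	   *result = pg_strdup(fmtId(rawid));

	fmtId_calls++;
	fmtId_bytes += strlen(result) + 1;
	return result;
}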

(Wanders away wondering how many relpathbackend callers bother to free
its result, and whether that matters either ...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index build temp files

2013-01-08 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Bruce Momjian (br...@momjian.us) wrote:
  Well, our docs for temp_tablespaces say:
  This variable specifies tablespaces in which to create temporary
  objects (temp tables and indexes on temp tables) when a
  CREATE command does not explicitly specify a tablespace.
  Temporary files for purposes such as sorting large data sets
  are also created in these tablespaces.
  
  Are you saying this is inaccurate?

 Yes and no?  Are the temporary files created during a CREATE INDEX
 considered "Temporary files for purposes such as sorting large data
 sets"?  My thinking is 'yes', but others may disagree.  Also,
 considering this a bug would imply that it's back-patchable, and I'm not
 convinced it's worth the risk of breaking things which depend on the
 current behavior.

I don't think it's a bug.  What you seem to be proposing is that CREATE
INDEX ought to ignore temp_tablespaces and instead always put its temp
files in the tablespace where the finished index will reside.  This
would not be a good idea IMO --- allowing the temp files to be spread to
other tablespaces is better both from the standpoint of space usage and
the ability to overlap I/O.  (Admittedly, both of those concerns are
often theoretical, but if they are then I don't see why you'd care which
tablespace is used.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index build temp files

2013-01-08 Thread Simon Riggs
On 9 January 2013 00:04, Tom Lane t...@sss.pgh.pa.us wrote:

 I don't think it's a bug.  What you seem to be proposing is that CREATE
 INDEX ought to ignore temp_tablespaces and instead always put its temp
 files in the tablespace where the finished index will reside.

I don't think that's what he said.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 8, 2013 at 9:53 AM, Claudio Freire klaussfre...@gmail.com wrote:
 Rather, I'd propose the default setting should be -1 or something
 default and automagic that works most of the time (but not all).

 A cruder heuristic that might be useful is 3 * shared_buffers.

Both parts of that work for me.  It's certainly silly that the default
value of effective_cache_size is now equivalent to the default value
of shared_buffers.  And I don't especially like the idea of trying to
make it depend directly on the box's physical RAM, for the same
practical reasons Robert mentioned.

It might be better to use 4 * shared_buffers, as that corresponds to the
multiple that's been the default since 8.2 or so (ie 128MB vs 32MB), and
3x just seems kinda oddball.
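
Something like this sketch, say, with -1 as an "auto" sentinel
(effective_cache_size and NBuffers are both measured in 8kB pages, so
the multiple carries over directly):

extern int	effective_cache_size;	/* the existing GUC variable */
extern int	NBuffers;				/* shared_buffers, in pages */

static void
auto_tune_effective_cache_size(void)
{
	if (effective_cache_size == -1)		/* hypothetical "auto" sentinel */
		effective_cache_size = 4 * NBuffers;
}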

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index build temp files

2013-01-08 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 9 January 2013 00:04, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't think it's a bug.  What you seem to be proposing is that CREATE
 INDEX ought to ignore temp_tablespaces and instead always put its temp
 files in the tablespace where the finished index will reside.

 I don't think that's what he said.

Then I misunderstood.  Stephen, what was your thought exactly about
what should happen?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 6:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 And functions that return static buffers are evil incarnate.  I've
 spent way too much of my life dealing with the supreme idiocy that is
 fmtId().  If someone ever finds a way to make that go away, I will buy
 them a beverage of their choice at the next conference we're both at.

 Yeah, that was exactly the case that was top-of-mind when I was
 complaining about static return buffers upthread.

 It's not hard to make the ugliness go away: just let it strdup its
 return value.  The problem is that in the vast majority of usages it
 wouldn't be convenient to free the result, so we'd have a good deal
 of memory leakage.  What might be interesting is to instrument it to
 see how much (adding a counter to the function ought to be easy enough)
 and then find out whether it's an amount we still care about in 2013.
 Frankly, pg_dump is a memory hog already - a few more identifier-sized
 strings laying about might not matter anymore.

 (Wanders away wondering how many relpathbackend callers bother to free
 its result, and whether that matters either ...)

I was thinking more about a sprintf()-type function that only
understands a handful of escapes, but adds the additional and novel
escapes %I (quote as identifier) and %L (quote as literal).  I think
that would allow a great deal of code simplification, and it'd be more
efficient, too.
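
A minimal sketch of the shape I have in mind (bounds handling
simplified; fmtId() is the existing helper, quote_literal_cstr() is
assumed to exist):

#include <stdarg.h>
#include <stdio.h>

extern const char *fmtId(const char *rawid);			/* existing */
extern const char *quote_literal_cstr(const char *raw);	/* assumed */

static void
fmt_query(char *dst, size_t dstlen, const char *fmt, ...)
{
	va_list		args;
	size_t		used = 0;

	va_start(args, fmt);
	while (*fmt && used < dstlen - 1)
	{
		if (fmt[0] == '%' && (fmt[1] == 'I' || fmt[1] == 'L' || fmt[1] == 's'))
		{
			const char *arg = va_arg(args, const char *);
			const char *expanded =
				(fmt[1] == 'I') ? fmtId(arg) :
				(fmt[1] == 'L') ? quote_literal_cstr(arg) : arg;

			/* copy the expanded argument, truncating if out of space */
			while (*expanded && used < dstlen - 1)
				dst[used++] = *expanded++;
			fmt += 2;
		}
		else
			dst[used++] = *fmt++;
	}
	dst[used] = '\0';
	va_end(args);
}

so that e.g. fmt_query(buf, sizeof(buf), "ALTER TABLE %I RENAME TO %I",
oldname, newname) comes out with both names properly quoted.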

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 8, 2013 at 9:53 AM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 Rather, I'd propose the default setting should be -1 or something
 default and automagic that works most of the time (but not all).

 A cruder heuristic that might be useful is 3 * shared_buffers.

 Both parts of that work for me.  It's certainly silly that the default
 value of effective_cache_size is now equivalent to the default value
 of shared_buffers.  And I don't especially like the idea of trying to
 make it depend directly on the box's physical RAM, for the same
 practical reasons Robert mentioned.

For the record, I don't believe those problems would be particularly
hard to solve.

 It might be better to use 4 * shared_buffers, as that corresponds to the
 multiple that's been the default since 8.2 or so (ie 128MB vs 32MB), and
 3x just seems kinda oddball.

I suspect that would be OK, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I was thinking more about a sprintf()-type function that only
 understands a handful of escapes, but adds the additional and novel
 escapes %I (quote as identifier) and %L (quote as literal).  I think
 that would allow a great deal of code simplification, and it'd be more
 efficient, too.

Seems like a great idea.  Are you offering to code it?

Note that this wouldn't entirely fix the fmtId problem, as not all the
uses of fmtId are directly in sprintf calls.  Still, it might get rid of
most of the places where it'd be painful to avoid a memory leak with
a strdup'ing version of fmtId.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

2013-01-08 Thread Robert Haas
On Tue, Jan 8, 2013 at 7:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I was thinking more about a sprintf()-type function that only
 understands a handful of escapes, but adds the additional and novel
 escapes %I (quote as identifier) and %L (quote as literal).  I think
 that would allow a great deal of code simplification, and it'd be more
 efficient, too.

 Seems like a great idea.  Are you offering to code it?

Not imminently.

 Note that this wouldn't entirely fix the fmtId problem, as not all the
 uses of fmtId are directly in sprintf calls.  Still, it might get rid of
 most of the places where it'd be painful to avoid a memory leak with
 a strdup'ing version of fmtId.

Yeah, I didn't think about that.  Might be worth a look to see how
comprehensively it would solve the problem.  But I'll have to leave
that for another day.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ...  And I don't especially like the idea of trying to
 make it depend directly on the box's physical RAM, for the same
 practical reasons Robert mentioned.

 For the record, I don't believe those problems would be particularly
 hard to solve.

Well, the problem of finding out the box's physical RAM is doubtless
solvable if we're willing to put enough sweat and tears into it, but
I'm dubious that it's worth the trouble.  The harder part is how to know
whether the box is supposed to be dedicated to the database.  Bear in mind
that the starting point of this debate was the idea that we're talking
about an inexperienced DBA who doesn't know about any configuration knob
we might provide for the purpose.

I'd prefer to go with a default that's predictable and not totally
foolish --- and some multiple of shared_buffers seems like it'd fit the
bill.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump transaction's read-only mode

2013-01-08 Thread Gurjeet Singh
On Mon, Dec 31, 2012 at 1:38 AM, Pavan Deolasee pavan.deola...@gmail.com wrote:

 On Sun, Dec 30, 2012 at 12:38 AM, Stephen Frost sfr...@snowman.net
 wrote:
  * Pavan Deolasee (pavan.deola...@gmail.com) wrote:
  On Fri, Sep 7, 2012 at 6:06 PM, Kevin Grittner
   That makes sense to me.  The reason I didn't make that change when I
   added the serializable special case to pg_dump was that it seemed
   like a separate question; I didn't want to complicate an already big
   patch with unnecessary changes to non-serializable transactions.
  
 
  If we agree, should we change that now ?
 
  This is on the next commitfest, so I figure it deserves some comment.
  For my part- I tend to agree that we should have it always use a read
  only transaction.  Perhaps we should update the pg_dump documentation to
  mention this as well though?  Pavan, do you want to put together an
  actual patch?
 

  I'd posted the actual patch on this thread, but probably linked the wrong
  message-id on the commitfest page. Will check and correct. Regarding
  pg_dump's documentation, I don't have strong views on that. Whether
  pg_dump runs as a read-only transaction or not is entirely internal to
  its implementation, but then if we make this change, it might be worth
  telling users that they can trust that pg_dump will not make any
  changes to their database and is hence a safe operation to carry out.


I have updated the commitfest submission to link to the correct patch email.

I initially thought that this patch deserved accompanying documentation,
because pg_dump's serializable transaction might error out due to a
conflict. But the following line in the docs [1] confirms otherwise:

read-only transactions will never have serialization conflicts

So no doc patch necessary :)

[1] http://www.postgresql.org/docs/9.2/static/transaction-iso.html
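
For reference, the gist of the change is roughly this (a sketch, not
the patch text; Archive and ExecuteSqlStatement() are pg_dump's
existing type and helper):

static void
setup_read_only_transaction(Archive *fout)
{
	ExecuteSqlStatement(fout, "BEGIN");
	/* 9.1+ syntax shown; older servers would need a different incantation */
	if (fout->remoteVersion >= 90100)
		ExecuteSqlStatement(fout,
							"SET TRANSACTION ISOLATION LEVEL "
							"REPEATABLE READ, READ ONLY");
}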
-- 
Gurjeet Singh

http://gurjeet.singh.im/


  1   2   >