Re: [HACKERS] Moving more work outside WALInsertLock

2011-12-23 Thread Heikki Linnakangas

On 16.12.2011 15:42, Heikki Linnakangas wrote:

On 16.12.2011 15:03, Simon Riggs wrote:

On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

On 16.12.2011 14:37, Simon Riggs wrote:


I already proposed a design for that using page-level share locks; any
reason not to go with that?


Sorry, I must've missed that. Got a link?


From nearly 4 years ago.

http://grokbase.com/t/postgresql.org/pgsql-hackers/2008/02/reworking-wal-locking/145qrhllcqeqlfzntvn7kjefijey



Ah, thanks. That is similar to what I'm experimenting, but a second
lwlock is still fairly heavy-weight. I think with many backends, you
will be beaten badly by contention on the spinlocks alone.

I'll polish up and post what I've been experimenting with, so we can
discuss that.


So, here's a WIP patch of what I've been working on. The WAL insertion 
is split into two stages:


1. Reserve the space from the WAL stream. This is done while holding a 
spinlock. The page holding the reserved space doesn't necessarily need to 
be in cache yet; the reservation can run ahead of the WAL buffer cache. 
(Quick testing suggests that an lwlock is too heavy-weight for this.)


2. Ensure the page is in the WAL buffer cache. If not, initialize it, 
evicting old pages if needed. Then finish the CRC calculation of the 
header and memcpy the record in place. (if the record spans multiple 
pages, it operates on one page at a time, to avoid problems with running 
out of WAL buffers)


As long as wal_buffers is high enough, and the I/O can keep up, stage 2 
can happen in parallel in many backends. The WAL writer process 
pre-initializes new pages ahead of the insertions, so regular backends 
rarely need to do that.
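The two-stage scheme above can be sketched as a toy concurrency model (illustrative Python, not the patch code; a lock stands in for the spinlock and a dict for the WAL buffers):

```python
import threading

class WalModel:
    """Toy model of the two-stage insert: a small lock protects only the
    space reservation; the copy of the record happens outside it."""
    def __init__(self):
        self.reserve_lock = threading.Lock()  # stands in for the spinlock
        self.insert_pos = 0                   # next free byte of the WAL stream
        self.buf = {}                         # offset -> byte, stands in for WAL buffers

    def insert(self, record):
        # Stage 1: reserve space (short critical section)
        with self.reserve_lock:
            start = self.insert_pos
            self.insert_pos += len(record)
        # Stage 2: copy the record into place, concurrently with other backends
        for i, b in enumerate(record):
            self.buf[start + i] = b
        return start

wal = WalModel()
offsets = []

def worker(payload):
    offsets.append(wal.insert(payload))

threads = [threading.Thread(target=worker, args=(b"rec%d" % i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every record got a disjoint region: 4 records of 4 bytes each.
assert wal.insert_pos == 16
assert sorted(offsets) == [0, 4, 8, 12]
```

The point of the split is that only stage 1 is serialized; the copy in stage 2 can run in all backends at once.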


When a page is written out, with XLogWrite(), you need to wait for any 
in-progress insertions to the pages you're about to write out to finish. 
For that, every backend has a slot with an XLogRecPtr in shared memory. 
It's set to the position where that backend is currently inserting to. 
If there's no insertion in-progress, it's invalid, but when it's valid 
it acts like a barrier, so that no-one is allowed to XLogWrite() beyond 
that position. That's very lightweight to the backends, but I'm using 
busy-waiting to wait on an insertion to finish ATM. That should be 
replaced with something smarter, that's the biggest missing part of the 
patch.
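The barrier behaviour can be modelled in a few lines (a hypothetical sketch, not the actual shared-memory layout): each backend publishes its in-progress insert position in a slot, and a writer may only flush up to a point once no slot holds a smaller position.

```python
# slots[i] holds the WAL position backend i is currently inserting at,
# or None when it has no insertion in progress (hypothetical layout).
slots = [None, None, None]

def oldest_in_progress():
    active = [p for p in slots if p is not None]
    return min(active) if active else None

def may_write_up_to(write_upto):
    """XLogWrite() may flush up to write_upto only once no insertion is
    still in progress below that point (the WIP patch busy-waits here)."""
    oldest = oldest_in_progress()
    return oldest is None or oldest >= write_upto

slots[1] = 1000                   # backend 1 is inserting at byte 1000
assert may_write_up_to(1000)      # nothing in progress below 1000
assert not may_write_up_to(1500)  # backend 1's slot acts as a barrier
slots[1] = None                   # insertion finished
assert may_write_up_to(1500)
```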


One simple way to test the performance impact of this is:

psql -c "DROP TABLE IF EXISTS foo; CREATE TABLE foo (id int4); CHECKPOINT" postgres
echo "BEGIN; INSERT INTO foo SELECT i FROM generate_series(1, 1) i; ROLLBACK" > parallel-insert-test.sql

pgbench -n -T 10 -c4 -f parallel-insert-test.sql postgres

On my dual-core laptop, this patch increases the tps on that from about 
60 to 110.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Real-life range datasets

2011-12-23 Thread Alexander Korotkov
Hello,

On Thu, Dec 22, 2011 at 12:51 PM, Benedikt Grundmann 
bgrundm...@janestreet.com wrote:

 I should be able to give you a table with the same characteristics as
 the instruments table but bogus data by replacing all entries in the
 table with random strings of the same length or something like that.
 I can probably take a little bit of time during this or the next week
 to generate such fake real world data ;-)   Is there an ftp site to
 upload the gzipped pg_dump file to?


Thank you very much for your response! I'm going to send you access details
for the upload soon.

-
With best regards,
Alexander Korotkov.


Re: [HACKERS] Extensions and 9.2

2011-12-23 Thread Daniel Farina
On Wed, Dec 21, 2011 at 5:46 AM, Robert Haas robertmh...@gmail.com wrote:
 Assuming the command in
 question can be stuffed inside a function, the most you're gaining is
 a little notational convenience

I can answer that one (why a full-blown mechanism for a notational convenience).

It has occurred to me to use this mechanism to support extensions, but
I find the prospect of having to teach people special operators to
use the standard extension feature unpalatable.  The single biggest
problem is that pg_restore will not know to call this function rather
than run CREATE EXTENSION, and then two databases prepared exactly the
same way cannot be pg_dump-ed and restored in a reasonable way.  If
there's a definable whitelist, then this vital functionality will work.

There are other sicknesses imposed if one has to hack up how to
delegate extension creation to non-superusers:

* The Postgres manual, wiki, and the ecosystem of recipes on the web at
large are basically not going to work without modification.  Death by a
thousand cuts.

* \df in psql displays some new operators that you aren't familiar
with.  Also, setting aside that your pg_dump/pg_restore-generated SQL
will not work, it will look funny.  This is an eyesore.

* If one tells someone else "yeah, the system supports extensions,"
they're going to type CREATE EXTENSION.  And then it's not going to
work, and then they're either going to give up, yak-shave for a few
hours figuring out what they were supposed to do for their provider
or organization, or, maybe worst of all, hack their way around
functionality the extension could have provided in a cleaner way had
it just worked nice and tidy to begin with.

I find this functionality basically vital because it greatly decreases
the friction between people, organizations, and software in the domain
of deploying, reasoning, and communicating about the installation and
status of Postgres extensions in a database.

-- 
fdr



Re: [HACKERS] Making TEXT NUL-transparent

2011-12-23 Thread Florian Weimer
* Florian Pflug:

 On Nov24, 2011, at 10:54 , Florian Weimer wrote:
 Or is it not only about being able to *store* NULs in a text field?
 
 No, the entire core should be NUL-transparent.

 That's unlikely to happen.

Yes, with the type input/output functions tied to NUL-terminated
strings, that seems indeed unlikely to happen.

 A more realistic approach would be to solve this only for UTF-8
 encoded strings by encoding the NUL character not as a single 0 byte,
 but as sequence of non-0 bytes.

0xFF cannot occur in valid UTF-8, so that's one possibility.

 Java, for example, seems to use it to serialize Strings (which may contain
 NUL characters) to UTF-8.

Only internally in the VM.  UTF-8 produced by the I/O encoder/decoders
produces and consumes NUL bytes.
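For reference, the Java serialization format mentioned here ("modified UTF-8") encodes U+0000 as the overlong two-byte sequence 0xC0 0x80 instead of a single 0 byte; a minimal sketch:

```python
def encode_modified_utf8(s):
    """UTF-8, except U+0000 becomes the overlong pair C0 80, so the
    output never contains a 0 byte (Java's 'modified UTF-8')."""
    out = bytearray()
    for ch in s:
        if ch == '\x00':
            out += b'\xc0\x80'
        else:
            out += ch.encode('utf-8')
    return bytes(out)

def decode_modified_utf8(b):
    # C0 80 is invalid in standard UTF-8, so this substitution is unambiguous.
    return b.replace(b'\xc0\x80', b'\x00').decode('utf-8')

enc = encode_modified_utf8('a\x00b')
assert 0 not in enc                       # NUL-free byte string
assert decode_modified_utf8(enc) == 'a\x00b'
```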

 Should you try to add a new encoding which supports that, you might also
 want to allow CESU-8-style encoding of UTF-16 surrogate pairs. This means
 that code points representable by UTF-16 surrogate pairs may be encoded by
 separately encoding the two surrogate characters in UTF-8.

I'm not sure if this is a good idea.  The motivation behind CESU-8 is
that it sorts byte-encoded strings in the same order as UTF-16, which is
a completely separate concern.

-- 
Florian Weimer            fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99



Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-12-23 Thread Kohei KaiGai
2011/12/22 Robert Haas robertmh...@gmail.com:
 On Mon, Dec 12, 2011 at 12:00 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The v8.option-2 add checks around examine_simple_variable, and
 prevent to reference statistical data, if Var node tries to reference
 relation with security-barrier attribute.

 I adopted this approach, and committed this.

Thanks for your help and efforts.

I'd also like the regression test on select_view to be committed,
to detect unexpected changes in the future. How about it?

Best regards,
-- 
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] reprise: pretty print viewdefs

2011-12-23 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I *still* spend a lot of time editing in a 25x80 window.

80 is a good choice whatever the screen size, because it's about the
most efficient text width as far as eye movement is concerned:  the
eye is much better at going top-to-bottom than left-to-right.  That's also
why printed papers still pick shorter columns, and website designs often
do, too.  If it's physically tiring to read your text, very few people will.

Now, 25 lines… maybe you should tweak your terminal setups, or consider
editing in a more comfortable environment :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Magnus Hagander
On Tue, Dec 20, 2011 at 14:08, Euler Taveira de Oliveira
eu...@timbira.com wrote:
 On 20-12-2011 07:27, Magnus Hagander wrote:
 On Tue, Dec 6, 2011 at 19:06, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira
 eu...@timbira.com wrote:
 On 06-12-2011 13:11, Robert Haas wrote:
 On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander mag...@hagander.net 
 wrote:
 I've been considering similar things, as you can find in the archives,
 but what I was thinking of was converting the number to just a plain
 bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to actually coding it, though. It could
 easily be extracted from your patch of course - and I think that's a
 more flexible approach. Is there some advantage to your method that
 I'm missing?

 I went so far as to put together an lsn data type.  I didn't actually
 get all that far with it, which is why I haven't posted it sooner, but
 here's what I came up with.  It's missing indexing support and stuff,
 but that could be added if people like the approach.  It solves this
 problem by implementing -(lsn,lsn) = numeric (not int8, that can
 overflow since it is not unsigned), which allows an lsn = numeric
 conversion by just subtracting '0/0'::lsn.
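For illustration, the lsn-to-number conversion described above amounts to parsing 'X/Y' as a 64-bit value (a simplified sketch assuming a flat 2^32 bytes per log id; Python integers don't overflow, which is the very property that motivates numeric rather than int8 on the SQL side):

```python
def lsn_to_int(lsn):
    """Parse an 'X/Y' WAL location (hex log id / hex offset) into one
    integer, assuming a flat 2^32 bytes per log id."""
    hi, lo = lsn.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

def lsn_diff(a, b):
    """Byte difference between two WAL locations; subtracting '0/0'
    gives the plain numeric value of an LSN."""
    return lsn_to_int(a) - lsn_to_int(b)

assert lsn_diff('0/0', '0/0') == 0
assert lsn_diff('1/0', '0/0') == 1 << 32
assert lsn_diff('0/2ECA438', '0/0') == 0x2ECA438
```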

 Interesting approach. I don't want to go that far. If so, you want to 
 change
 all of those functions that deal with LSNs and add some implicit conversion
 between text and lsn data types (for backward compatibility). As of int8, 
 I'm

 As long as you have the conversion, you don't really need to change
 them, do you? It might be nice in some ways, but this is still a
 pretty internal operation, so I don't see it as critical.

 For correctness, yes.

 At this point, my question is: do we want to support the lsn data type idea or
 a basic function that implements the difference between LSNs?

Personally I think a function is enough - it solves the only case that
I've actually seen. But a datatype would be a more complete solution,
of course - but it seems a bit of an overkill to me. Not really sure
which way we should go - I was hoping somebody else would comment as
well..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] Another hstore_type idea

2011-12-23 Thread Marc Mamin
Hello,

after reading the thread on Typed hstore proposal, I wonder if another
similarly minded extension of hstore would be beneficial:
add additional hstore types with a numerical data type for the values.

e.g.:  hstore_float : text => float

This would allow adding some nice aggregation functions on these hstores,
e.g.

select distinct_sum(x) from
(
  a => 1   b => 5
  a => 3   c => 2
)

=>

a => 4  b => 5  c => 2


I have a small case where I'm doing this with a custom type (text,
float), but with poor performance.
I guess that such an extension would make sense if it were to bring a
significant performance gain compared to the custom type approach.
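A minimal model of the distinct_sum aggregate sketched above (hypothetical name, Python dicts standing in for hstore_float values):

```python
def distinct_sum(hstores):
    """Merge text->float maps, summing values per key: the aggregate
    behaviour sketched in the example above."""
    out = {}
    for h in hstores:
        for key, val in h.items():
            out[key] = out.get(key, 0.0) + val
    return out

rows = [{'a': 1.0, 'b': 5.0}, {'a': 3.0, 'c': 2.0}]
assert distinct_sum(rows) == {'a': 4.0, 'b': 5.0, 'c': 2.0}
```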

best regards and Merry Christmas,

Marc Mamin


Re: [HACKERS] Moving more work outside WALInsertLock

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 3:15 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 23.12.2011 10:13, Heikki Linnakangas wrote:
 So, here's a WIP patch of what I've been working on.

 And here's the patch I forgot to attach..

Fails regression tests for me.  I found this in postmaster.log:

PANIC:  could not find WAL buffer for 0/2ECA438
STATEMENT:  ANALYZE onek2;
LOG:  stuck waiting upto 0/300
LOG:  server process (PID 34529) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: ANALYZE onek2;

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 5:56 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I'd also like the regression test on select_view to be committed,
 to detect unexpected changes in the future. How about it?

Can you resend that as a separate patch?  I remember there were some
things I didn't like about it, but I don't remember what they were at
the moment...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Robert Haas
On Thu, Dec 22, 2011 at 11:49 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 21, 2011 at 5:04 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 this patch adds a bytea_agg aggregation.

 It allows fast bytea concatenation.

 Looks fine to me.  I'll commit this, barring objections.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Robert Haas
2011/12/22 Tomas Vondra t...@fuzzy.cz:
 The gettimeofday() calls are not exactly cheap in some cases, so why
 pay that price when all you need is the number of rows?

Fair point.

 The patch attached does this:

 1) adds INSTRUMENT_ROWS, a new InstrumentOption

   - counts rows without timing (no gettimeofday() calls)
   - if you want timing info, use INSTRUMENT_TIMER

 2) adds new option TIMING to EXPLAIN, i.e.

    EXPLAIN (ANALYZE ON, TIMING ON) SELECT ...

 3) adds auto_explain.log_rows_only (false by default)

   - if you set this to 'true', then the instrumentation will just
     count rows, without calling gettimeofday()

This seems like an unnecessarily confusing interface, because you've
named the auto_explain option differently from the EXPLAIN option and
given it (almost) the opposite sense: timing=off means the same thing
as log_rows_only=on.

I think the EXPLAIN (TIMING) option is good the way you have it, but
then just have auto_explain.log_timing, with a default value of on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Tomas Vondra
On 23.12.2011 14:57, Robert Haas wrote:
 2011/12/22 Tomas Vondra t...@fuzzy.cz:
 The gettimeofday() calls are not exactly cheap in some cases, so why
 pay that price when all you need is the number of rows?
 
 Fair point.
 
 The patch attached does this:

 1) adds INSTRUMENT_ROWS, a new InstrumentOption

   - counts rows without timing (no gettimeofday() calls)
   - if you want timing info, use INSTRUMENT_TIMER

 2) adds new option TIMING to EXPLAIN, i.e.

EXPLAIN (ANALYZE ON, TIMING ON) SELECT ...

 3) adds auto_explain.log_rows_only (false by default)

   - if you set this to 'true', then the instrumentation will just
 count rows, without calling gettimeofday()
 
 This seems like an unnecessarily confusing interface, because you've
 named the auto_explain option differently from the EXPLAIN option and
 given it (almost) the opposite sense: timing=off means the same thing
 as log_rows_only=on.
 
 I think the EXPLAIN (TIMING) option is good the way you have it, but
 then just have auto_explain.log_timing, with a default value of on.

Makes sense. I've updated the patch to reflect this, so the option is
called auto_explain.log_timing and is true by default. I'll add the
patch to the next commit fest.

One thing I'm wondering about is that the InstrumentOptions are not
exclusive - INSTRUMENT_TIMER means 'collect timing and row counts' while
INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be better to
redefine INSTRUMENT_TIMER so that it collects just timing info? I.e.,
to get the current behaviour, you'd have to do this:

   instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS)

It's quite a trivial change in explain.c; the problem I have with that is
that it might break extensions.

Tomas
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index b320698..4b52b26 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -23,6 +23,7 @@ static int auto_explain_log_min_duration = -1; /* msec or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
 static bool auto_explain_log_buffers = false;
+static bool auto_explain_log_timing  = false;
 static int auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
 
@@ -133,6 +134,17 @@ _PG_init(void)
                             NULL,
                             NULL);
 
+   DefineCustomBoolVariable("auto_explain.log_timing",
+                            "Collect timing data, not just row counts.",
+                            NULL,
+                            &auto_explain_log_timing,
+                            true,
+                            PGC_SUSET,
+                            0,
+                            NULL,
+                            NULL,
+                            NULL);
+
    EmitWarningsOnPlaceholders("auto_explain");
 
    /* Install hooks. */
@@ -170,7 +182,12 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
    /* Enable per-node instrumentation iff log_analyze is required. */
    if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
    {
-       queryDesc->instrument_options |= INSTRUMENT_TIMER;
+       if (auto_explain_log_timing)
+           queryDesc->instrument_options |= INSTRUMENT_TIMER;
+       else
+           queryDesc->instrument_options |= INSTRUMENT_ROWS;
+
+
        if (auto_explain_log_buffers)
            queryDesc->instrument_options |= INSTRUMENT_BUFFERS;
    }
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 8a9c9de..4488956 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable class="parameter">statement</replaceable>
     VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
     COSTS [ <replaceable class="parameter">boolean</replaceable> ]
     BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+    TIMING [ <replaceable class="parameter">boolean</replaceable> ]
     FORMAT { TEXT | XML | JSON | YAML }
 </synopsis>
  </refsynopsisdiv>
@@ -172,6 +173,20 @@ ROLLBACK;
    </varlistentry>
 
    <varlistentry>
+    <term><literal>TIMING</literal></term>
+    <listitem>
+     <para>
+      Include information on timing for each node. Specifically, include the
+      actual startup time and time spent in the node. This may involve a lot
+      of gettimeofday calls, and on some systems this means significant
+      slowdown of the 

Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 At this point, my question is: do we want to support the lsn data type idea 
 or
 a basic function that implements the difference between LSNs?

 Personally I think a function is enough - it solves the only case that
 I've actually seen. But a datatype would be a more complete solution,
 of course - but it seems a bit of an overkill to me. Not really sure
 which way we should go - I was hoping somebody else would comment as
 well..

I too think a datatype is overkill, if we're only planning on providing
one function.  Just emit the values as numeric and have done.

regards, tom lane



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I too think a datatype is overkill, if we're only planning on providing
 one function.

Are there any other functions we ought to provide?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Andrew Dunstan



On 12/23/2011 10:05 AM, Tom Lane wrote:

Magnus Hagandermag...@hagander.net  writes:

At this point, my question is: do we want to support the lsn data type idea or
a basic function that implements the difference between LSNs?

Personally I think a function is enough - it solves the only case that
I've actually seen. But a datatype would be a more complete solution,
of course - but it seems a bit of an overkill to me. Not really sure
which way we should go - I was hoping somebody else would comment as
well..

I too think a datatype is overkill, if we're only planning on providing
one function.  Just emit the values as numeric and have done.




+1.

cheers

andrew



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 One thing I'm wondering about is that the InstrumentOptions are not
 exclusive - INSTRUMENT_TIMER means 'collect timing and row counts' while
 INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be better to
 redefine the INSTRUMENT_TIMER so that it collects just timing info. I.e.
 to get the current behaviour, you'd have to do this

instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS)

 It's quite a trivial change in explain.c; the problem I have with that is
 that it might break extensions.

I'm not especially concerned by that angle --- we make bigger API
changes all the time.  But you could change the name, eg

instrument_options |= (INSTRUMENT_TIMING | INSTRUMENT_ROWS)

and then #define INSTRUMENT_TIMER as the OR of the two real bits
for backward compatibility.
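Tom's suggestion amounts to the following flag layout (bit values are assumptions for illustration; the real definitions would live in instrument.h):

```python
# Two independent bits, with the old name kept as their OR so existing
# callers keep getting both behaviours (bit values assumed).
INSTRUMENT_TIMING = 1 << 0   # collect timing only
INSTRUMENT_ROWS   = 1 << 1   # collect row counts only
INSTRUMENT_TIMER  = INSTRUMENT_TIMING | INSTRUMENT_ROWS  # legacy meaning

opts = 0
opts |= INSTRUMENT_TIMER             # old-style caller: both bits set
assert opts & INSTRUMENT_TIMING and opts & INSTRUMENT_ROWS

opts = INSTRUMENT_ROWS               # new-style: rows without gettimeofday()
assert not (opts & INSTRUMENT_TIMING)
```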

regards, tom lane



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I too think a datatype is overkill, if we're only planning on providing
 one function.

 Are there any other functions we ought to provide?

Even if there are several, what exact advantage does a datatype offer
over representing LSN values as numerics?  It seems to me to be adding
complication and extra code (I/O converters at least) for very little
gain.

regards, tom lane



Re: [HACKERS] Review: Non-inheritable check constraints

2011-12-23 Thread Robert Haas
On Thu, Dec 22, 2011 at 10:54 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 I agree with Robert that this usage of ALTER TABLE ONLY is slightly
 different from other usages of the same command, but I disagree that
 this means that we need another command to do what we want to do here.
 IOW, I prefer to keep the syntax we have.

Another disadvantage of the current syntax becomes evident when you
look at the pg_dump output.  If you pg_dump a regular constraint, the
constraint gets added as part of the table definition, and the rows
are all checked as they are inserted.  If you pg_dump an ONLY
constraint, the constraint gets added after loading the data,
requiring an additional full-table scan to validate it.

  I am tempted to say we should revert this and rethink.  I don't
  believe we are only a small patch away from finding all the bugs here.

 Sure, if we all think that CREATE TABLE should support ONLY CONSTRAINT type
 of syntax, then +1 for reverting this and a subsequent revised submission.

 I don't think this is a given ...  In fact, IMO if we're only two or
 three fixes away from having it all nice and consistent, I think
 reverting is not necessary.

Sure.  It's the if part of that sentence that I'm not too sure about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I too think a datatype is overkill, if we're only planning on providing
 one function.

 Are there any other functions we ought to provide?

 Even if there are several, what exact advantage does a datatype offer
 over representing LSN values as numerics?  It seems to me to be adding
 complication and extra code (I/O converters at least) for very little
 gain.

I guess I'm just constitutionally averse to labeling things as text
when they really aren't.  I do it all the time in Perl, of course, but
in PostgreSQL we have strong data typing, and it seems like we might
as well use it.

Also, we've occasionally talked (in the light of Pavan's single-pass
vacuum patch, for example) about needing to store LSNs in system
catalogs; and we're certainly not going to want to do that as text.
I'll admit that it's not 100% clear that anything like this will ever
happen, though, so maybe it's premature to worry about it.

I can see I'm in the minority on this one, though...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Even if there are several, what exact advantage does a datatype offer
 over representing LSN values as numerics?  It seems to me to be adding
 complication and extra code (I/O converters at least) for very little
 gain.

 I guess I'm just constitutionally averse to labeling things as text
 when they really aren't.

Er ... text?  I thought the proposal was to use numeric.

regards, tom lane



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 I would suggest you examine how to have an array of N bgwriters,
 then just slot the code for hinting into the bgwriter. That way a
 bgwriter can set hints, calc CRC and write pages in sequence on a
 particular block. The hinting needs to be synchronised with the
 writing to give good benefit.
 
 I'll think about that.  I see pros and cons, and I'll have to see
 how those balance out after I mull them over.
 
I think maybe the best solution is to create some common code to use
from both.  The problem with *just* doing it in bgwriter is that it
would not help much with workloads like Robert has been using for
most of his performance testing -- a database which fits entirely in
shared buffers and starts thrashing on CLOG.  For a background
hinter process my goal would be to deal with xids as they are passed
by the global xmin value, so that you have a cheap way to know that
they are ripe for hinting, and you can frequently hint a bunch of
transactions that are all in the same CLOG page which is recent
enough to likely be already loaded.
 
Now, a background hinter isn't going to be a net win if it has to
grovel through every tuple on every dirty page every time it sweeps
through the buffers, so the idea depends on having a sufficiently
efficient way to identify interesting buffers.  I'm hoping to
improve on this, but my best idea so far is to add a field to the
buffer header for the "earliest unhinted xid" for the page.  Whenever
this background process wakes up and is scanning through the buffers
(probably just in buffer number order), it does a quick check,
without any pin or lock, to see if the buffer is dirty and the
earliest unhinted xid is below the global xmin.  If it passes both
of those tests, there is definitely useful work which can be done if
the page doesn't get evicted before we can do it.  We pin the page,
recheck those conditions, and then we look at each tuple and hint
where possible.  As we go, we remember the earliest xid that we see
which is *not* being hinted, to store back into the buffer header
when we're done.  Of course, we would also update the buffer header
for new tuples or when an xmax is set if the xid involved precedes
what we have in the buffer header.
 
This would not only help avoid multiple page writes as unhinted
tuples on the page are read, it would minimize thrashing on CLOG and
move some of the hinting work from the critical path of reading a
tuple into a background process.
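The proposed scan could be modelled like this (a hypothetical sketch of the unlocked buffer-header check; as described above, the real process would pin the buffer, recheck, and only then hint tuples):

```python
class BufferHdr:
    def __init__(self, dirty, earliest_unhinted_xid):
        self.dirty = dirty
        # None means every tuple on the page is already hinted
        self.earliest_unhinted_xid = earliest_unhinted_xid

def sweep(buffers, global_xmin):
    """One unlocked pass over the buffer headers, picking only buffers
    that are dirty and hold an unhinted xid below the global xmin."""
    work = []
    for i, hdr in enumerate(buffers):
        if (hdr.dirty and hdr.earliest_unhinted_xid is not None
                and hdr.earliest_unhinted_xid < global_xmin):
            work.append(i)  # real code: pin, recheck, then hint tuples
    return work

bufs = [BufferHdr(True, 90), BufferHdr(True, 150),
        BufferHdr(False, 50), BufferHdr(True, None)]
assert sweep(bufs, 100) == [0]   # only buffer 0 is worth visiting yet
```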
 
Thoughts?
 
-Kevin



Re: [HACKERS] xlog location arithmetic

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 10:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Even if there are several, what exact advantage does a datatype offer
 over representing LSN values as numerics?  It seems to me to be adding
 complication and extra code (I/O converters at least) for very little
 gain.

 I guess I'm just constitutionally averse to labeling things as text
 when they really aren't.

 Er ... text?  I thought the proposal was to use numeric.

The proposal is to make a function that takes a text argument (which
is really an LSN, but we choose to represent it as text) and returns
numeric.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Alvaro Herrera

Excerpts from Tomas Vondra's message of Fri Dec 23 11:45:40 -0300 2011:

 One thing I'm wondering about is that the InstrumentOptions are not
 exclusive - INSTRUMENT_TIMER means 'collect timing and row counts' while
 INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be better to
 redefine INSTRUMENT_TIMER so that it collects just timing info? I.e.,
 to get the current behaviour, you'd have to do this:
 
instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS)
 
 It's quite a trivial change in explain.c; the problem I have with that is
 that it might break extensions.

No, maybe add INSTRUMENT_TIMER_ONLY and then define INSTRUMENT_TIMER as
setting both.  That way you don't break compatibility.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 11:14 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Thoughts?

Those are good thoughts.

Here's another random idea, which might be completely nuts.  Maybe we
could consider some kind of summarization of CLOG data, based on the
idea that most transactions commit.  We introduce the idea of a CLOG
rollup page.  On a CLOG rollup page, each bit represents the status of
N consecutive XIDs.  If the bit is set, that means all XIDs in that
group are known to have committed.  If it's clear, then we don't know,
and must fall through to a regular CLOG lookup.

If you let N = 1024, then 8K of CLOG rollup data is enough to
represent the status of 64 million transactions, which means that just
a couple of pages could cover as much of the XID space as you probably
need to care about.  Also, you would need to replace CLOG summary
pages in memory only very infrequently.  Backends could test the bit
without any lock.  If it's set, they do pg_read_barrier(), and then
check the buffer label to make sure it's still the summary page they
were expecting.  If so, no CLOG lookup is needed.  If the page has
changed under us or the bit is clear, then we fall through to a
regular CLOG lookup.

An obvious problem is that, if the abort rate is significantly
different from zero, and especially if the aborts are randomly mixed
in with commits rather than clustered together in small portions of
the XID space, the CLOG rollup data would become useless.  On the
other hand, if you're doing 10k tps, you only need to have a window of
a tenth of a second or so where everything commits in order to start
getting some benefit, which doesn't seem like a stretch.

Perhaps the CLOG rollup data wouldn't even need to be kept on disk.
We could simply have bgwriter (or bghinter) set the rollup bits in
shared memory for new transactions, as it becomes possible to do so,
and let lookups for XIDs prior to the last shutdown fall through to
CLOG.  Or, if that's not appealing, we could reconstruct the data in
memory by groveling through the CLOG pages - or maybe just set summary
bits only for CLOG pages that actually get faulted in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 An obvious problem is that, if the abort rate is significantly
 different from zero, and especially if the aborts are randomly mixed
 in with commits rather than clustered together in small portions of
 the XID space, the CLOG rollup data would become useless.

Yeah, I'm afraid that with N large enough to provide useful
acceleration, the cases where you'd actually get a win would be too thin
on the ground to make it worth the trouble.

regards, tom lane



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Peter Eisentraut
On ons, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
 this patch adds a bytea_agg aggregation.
 
 It allows fast bytea concatenation.

Why not call it string_agg?  All the function names are the same between
text and bytea (e.g., ||, substr, position, length).  It would be nice
not to introduce arbitrary differences.




Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Pavel Stehule
Hello

2011/12/23 Peter Eisentraut pete...@gmx.net:
 On ons, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
 this patch adds a bytea_agg aggregation.

 It allows fast bytea concatenation.

 Why not call it string_agg?  All the function names are the same between
 text and bytea (e.g., ||, substr, position, length).  It would be nice
 not to introduce arbitrary differences.

My opinion is not too strong. I don't think string_agg is a good name
(for bytea_agg) - a minimal (and the only) reason is the different API -
there is no support for a delimiter. If I remember the string_agg
discussion well, where the delimiter was problematic, there was a
request for an immutable interface for aggregates - there was an issue
with the ORDER clause. So bytea_agg is a good name.

Regards

Pavel





Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 12:51 PM, Peter Eisentraut pete...@gmx.net wrote:
 On ons, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
 this patch adds a bytea_agg aggregation.

 It allows fast bytea concatenation.

 Why not call it string_agg?  All the function names are the same between
 text and bytea (e.g., ||, substr, position, length).  It would be nice
 not to introduce arbitrary differences.

Well, because it doesn't operate on strings.

I argued when we added string_agg that it ought to be called
concat_agg, or something like that, but I got shouted down.  So now
here we are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 12:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 An obvious problem is that, if the abort rate is significantly
 different from zero, and especially if the aborts are randomly mixed
 in with commits rather than clustered together in small portions of
 the XID space, the CLOG rollup data would become useless.

 Yeah, I'm afraid that with N large enough to provide useful
 acceleration, the cases where you'd actually get a win would be too thin
 on the ground to make it worth the trouble.

Well, I don't know: something like pgbench is certainly going to
benefit, because all the transactions commit.  I suspect that's true
for many benchmarks.  Whether it's true of real-life workloads is more
arguable, of course, but if the benchmarks aren't measuring things
that people really do with the database, then why are they designed
the way they are?

I've certainly written applications that relied on the database for
integrity checking, so rollbacks were an expected occurrence, but then
again those were very low-velocity systems where there wasn't going to
be enough CLOG contention to matter anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Tomas Vondra
On 23.12.2011 16:14, Tom Lane wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 One thing I'm wondering about is that the InstrumentOptions are not
 exclusive - INSTRUMENT_TIMER means 'collect timing and row counts' while
 INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be better to
 redefine INSTRUMENT_TIMER so that it collects just timing info? I.e.,
 to get the current behaviour, you'd have to do this:
 
instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS)
 
 It's quite a trivial change in explain.c; the problem I have with that is
 that it might break extensions.
 
 I'm not especially concerned by that angle --- we make bigger API
 changes all the time.  But you could change the name, eg
 
   instrument_options |= (INSTRUMENT_TIMING | INSTRUMENT_ROWS)
 
 and then #define INSTRUMENT_TIMER as the OR of the two real bits
 for backward compatibility.

OK, that seems like a good solution. But is it worth the additional
complexity in explain.c?

The motivation for this patch was that collecting timing data often
causes performance issues, and in some cases it's not needed. But is this
true for row counts?

Are there machines where collecting row counts is above noise level?
I've never seen that, but that's not a proof of nonexistence.

If the overhead of this is negligible, then I could just hide the row
counts from the output.

Tomas



Re: [HACKERS] Another hstore_type idea

2011-12-23 Thread Christopher Browne
On Fri, Dec 23, 2011 at 7:06 AM, Marc Mamin m.ma...@intershop.de wrote:
 after reading the thread on the Typed hstore proposal, I wonder if another,
 similarly minded extension of hstore would be beneficial:

 add additional hstore types with numerical data type for the values.

I would expect the primary *performance* value in an hstore
extension to come from things that allow accessing data without
needing to unbox it.
(I remember the concept of unboxing from APL; it seems to have been
subsumed by object oriented terminology...
http://en.wikipedia.org/wiki/Object_type_%28object-oriented_programming%29#Unboxing)

The big win comes not so much from type matching (which seems to me
like a morass, as you'd need a zillion Postgres types to cover all
the cases) as from avoiding the need to take the blobs of tuple data
and re-parse them.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Peter Eisentraut
On fre, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
 On Fri, Dec 23, 2011 at 12:51 PM, Peter Eisentraut pete...@gmx.net wrote:
  On ons, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
  this patch adds a bytea_agg aggregation.
 
  It allows fast bytea concatenation.
 
  Why not call it string_agg?  All the function names are the same between
  text and bytea (e.g., ||, substr, position, length).  It would be nice
  not to introduce arbitrary differences.
 
 Well, because it doesn't operate on strings.

Sure, binary strings.  Both the SQL standard and the PostgreSQL
documentation use that term.





Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 2:16 PM, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
 On Fri, Dec 23, 2011 at 12:51 PM, Peter Eisentraut pete...@gmx.net wrote:
  On ons, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
  this patch adds a bytea_agg aggregation.
 
  It allows fast bytea concatenation.
 
  Why not call it string_agg?  All the function names are the same between
  text and bytea (e.g., ||, substr, position, length).  It would be nice
  not to introduce arbitrary differences.

 Well, because it doesn't operate on strings.

 Sure, binary strings.  Both the SQL standard and the PostgreSQL
 documentation use that term.

I'm unimpressed by that argument, but let's see what other people think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Peter Eisentraut pete...@gmx.net wrote:
 Robert Haas wrote:
 Peter Eisentraut pete...@gmx.net wrote:
 On ons, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
 this patch adds a bytea_agg aggregation.

 It allows fast bytea concatenation.

 Why not call it string_agg?  All the function names are the
 same between text and bytea (e.g., ||, substr, position,
 length).  It would be nice not to introduce arbitrary
 differences.

 Well, because it doesn't operate on strings.

 Sure, binary strings.  Both the SQL standard and the PostgreSQL
 documentation use that term.
 
 I'm unimpressed by that argument, but let's see what other people
 think.
 
I, for one, try to be consistent about saying character strings
when that is what I mean.  Since at least the SQL-92 standard there
have been both character strings and bit strings, with a certain
amount of symmetry in how they are handled.   I don't remember when
binary strings were introduced, but that is the standard
terminology.  There is, for example, a standard substring function
for binary strings.
 
-Kevin



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Jeff Janes
On 12/23/11, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Dec 23, 2011 at 11:14 AM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Thoughts?

 Those are good thoughts.

 Here's another random idea, which might be completely nuts.  Maybe we
 could consider some kind of summarization of CLOG data, based on the
 idea that most transactions commit.

I had a perhaps crazier idea. Aren't CLOG pages older than global xmin
effectively read only?  Could backends that need these bypass locking
and shared memory altogether?

 An obvious problem is that, if the abort rate is significantly
 different from zero, and especially if the aborts are randomly mixed
 in with commits rather than clustered together in small portions of
 the XID space, the CLOG rollup data would become useless.  On the
 other hand, if you're doing 10k tps, you only need to have a window of
 a tenth of a second or so where everything commits in order to start
 getting some benefit, which doesn't seem like a stretch.

Could we get some major OLTP users to post their CLOG for analysis?  I
wouldn't think there would be many security/proprietary issues with
CLOG data.

Cheers,

Jeff



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 An obvious problem is that, if the abort rate is significantly
 different from zero, and especially if the aborts are randomly
 mixed in with commits rather than clustered together in small
 portions of the XID space, the CLOG rollup data would become
 useless.
 
 Yeah, I'm afraid that with N large enough to provide useful
 acceleration, the cases where you'd actually get a win would be
 too thin on the ground to make it worth the trouble.
 
Just to get a real-life data point, I checked the pg_clog directory
for the Milwaukee County Circuit Courts.  They have about 300 OLTP
users, plus replication feeds to the central servers.  Looking at
the files present now, there are 19,104 blocks of 256 bytes (which
should support N of 1024, per Robert's example).  Of those, 12,644
(just over 66%) contain 256 bytes of hex 55.
 
Last modified dates on the files go back to the 4th of October, so
this represents roughly three months worth of real-life
transactions.
 
-Kevin



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Kevin Grittner
Jeff Janes jeff.ja...@gmail.com wrote:
 
 Could we get some major OLTP users to post their CLOG for
 analysis?  I wouldn't think there would be much
 security/proprietary issues with CLOG data.
 
FWIW, I got the raw numbers to do my quick check using this Ruby
script (put together for me by Peter Brant).  If it is of any use to
anyone else, feel free to use it and/or post any enhanced versions
of it.
 
#!/usr/bin/env ruby

Dir.glob("*") do |file_name|
  contents = File.read(file_name)
  total =
    contents.enum_for(:each_byte).enum_for(:each_slice, 256).inject(0) do |count, chunk|
      if chunk.all? { |b| b == 0x55 }
        count + 1
      else
        count
      end
    end
  printf "%s %d\n", file_name, total
end
 
-Kevin



Re: [HACKERS] Page Checksums + Double Writes

2011-12-23 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 I had a perhaps crazier idea. Aren't CLOG pages older than global xmin
 effectively read only?  Could backends that need these bypass locking
 and shared memory altogether?

Hmm ... once they've been written out from the SLRU arena, yes.  In fact
you don't need to go back as far as global xmin --- *any* valid xmin is
a sufficient boundary point.  The only real problem is to know whether
the data's been written out from the shared area yet.

This idea has potential.  I like it better than Robert's, mainly because
I do not want to see us put something in place that would lead people to
try to avoid rollbacks.

regards, tom lane



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 2:16 PM, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
 Well, because it doesn't operate on strings.

 Sure, binary strings.  Both the SQL standard and the PostgreSQL
 documentation use that term.

 I'm unimpressed by that argument, but let's see what other people think.

I generally agree with Peter: string_agg makes sense here.  The only
real argument against it is Pavel's point that he didn't include a
delimiter parameter, but that just begs the question why not.  It
seems at least plausible that there would be use-cases for it.

So I think we should try to make this as much like the text case as
possible.

regards, tom lane



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 The motivation for this patch was that collecting timing data often
 causes performance issues and in some cases it's not needed. But is this
 true for row counts?

Perhaps more to the point, is there a use case for collecting timing
data without row counts?  I find it hard to visualize a valid reason.

regards, tom lane



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Pavel Stehule
Hello

2011/12/23 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 2:16 PM, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
 Well, because it doesn't operate on strings.

 Sure, binary strings.  Both the SQL standard and the PostgreSQL
 documentation use that term.

 I'm unimpressed by that argument, but let's see what other people think.

 I generally agree with Peter: string_agg makes sense here.  The only
 real argument against it is Pavel's point that he didn't include a
 delimiter parameter, but that just begs the question why not.  It
 seems at least plausible that there would be use-cases for it.

I don't know a real use case for a bytea delimiter. Probably there is
one, but I expect the most common use case will be without a delimiter,
and when one is necessary, || can be used. I see two ways:

a) keep bytea_agg as it is now
b) use string_agg with a delimiter, which will usually be empty

Using string_agg for bytea is not too intuitive (but makes sense) -
maybe we can introduce a synonym type for bytea, like binary string
or bstring.

Regards

Pavel



 So I think we should try to make this as much like the text case as
 possible.

                        regards, tom lane



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Pavel Stehule
2011/12/23 Tom Lane t...@sss.pgh.pa.us:
 Tomas Vondra t...@fuzzy.cz writes:
 The motivation for this patch was that collecting timing data often
 causes performance issues and in some cases it's not needed. But is this
 true for row counts?

 Perhaps more to the point, is there a use case for collecting timing
 data without row counts?  I find it hard to visualize a valid reason.

yes - searching for bad predictions

Regards

Pavel


                        regards, tom lane




Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Kevin Grittner
Pavel Stehule pavel.steh...@gmail.com wrote:
 
 maybe we can introduce a synonym type for bytea - like binary
 string or bstring.
 
The standard mentions these names for binary strings:
 
BINARY, BINARY VARYING, or BINARY LARGE OBJECT
 
which have a certain symmetry with:
 
CHARACTER, CHARACTER VARYING, and CHARACTER
LARGE OBJECT
 
-Kevin



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/12/23 Tom Lane t...@sss.pgh.pa.us:
 Tomas Vondra t...@fuzzy.cz writes:
 The motivation for this patch was that collecting timing data often
 causes performance issues and in some cases it's not needed. But is this
 true for row counts?

 Perhaps more to the point, is there a use case for collecting timing
 data without row counts?  I find it hard to visualize a valid reason.

 yes - searching for bad predictions

No, because timing alone proves nothing at all.  The machine could just
have been overloaded.

regards, tom lane



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Tomas Vondra

On 23.12.2011 22:37, Pavel Stehule wrote:

2011/12/23 Tom Lane t...@sss.pgh.pa.us:

Tomas Vondra t...@fuzzy.cz writes:

The motivation for this patch was that collecting timing data often
causes performance issues and in some cases it's not needed. But is this
true for row counts?


Perhaps more to the point, is there a use case for collecting timing
data without row counts?  I find it hard to visualize a valid reason.


yes - searching for bad predictions


But that's the purpose of collecting row counts without timing data.

TL is asking about the opposite case - collecting timing data without 
row counts. I can't imagine such a use case ...


Tomas



[HACKERS] Representation of index clause lists

2011-12-23 Thread Tom Lane
No sooner had I committed e2c2c2e8b1df7dfdb01e7e6f6191a569ce3c3195 than
I started having second thoughts about the choice of representation.
The requirement is to tell which index column each potential index qual
is meant to be used with.  I used a list-of-sublists representation, in
which each sublist corresponds to an index column, because that's what
group_clauses_by_indexkey has historically produced.  But on reflection
that seems like a leftover Lisp hack more than it does a natural choice.
The alternative that comes to mind is to use a flat list of quals and a
parallel integer list of column numbers.  Places that need to track what
goes with what would chase both lists in parallel using forboth(), while
places that don't care could just ignore the integer list.

This would end up with a net savings of palloc overhead because we could
get rid of the assorted calls to flatten_clausegroups_list and
flatten_indexorderbys_list that are currently needed by code that
doesn't want to think about column correspondences.  And it just seems
a bit more natural.

The only downside I can think of is that amcostestimate functions would
need two more arguments, bringing us to the point where we need an
OidFunctionCall11 (!), or else do some more drastic restructuring like
passing them an IndexPath explicitly.  But the existing commit has
already effectively changed their API, so this doesn't seem like a
showstopper.

Thoughts either way?

regards, tom lane



Re: [HACKERS] patch: bytea_agg

2011-12-23 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of Fri Dec 23 18:36:11 -0300 2011:
 Hello
 
 2011/12/23 Tom Lane t...@sss.pgh.pa.us:

  I generally agree with Peter: string_agg makes sense here.  The only
  real argument against it is Pavel's point that he didn't include a
  delimiter parameter, but that just begs the question why not.  It
  seems at least plausible that there would be use-cases for it.
 
 I don't know a real usage for bytea delimiter. Probably there is, but
 I expect so most often use case will be without delimiter.

Well, sometimes bytea is used to store character strings when the
encoding information is to be handled by the app instead of having
Postgres know it, for various reasons.  I haven't seen any of those
cases yet that would use string_agg(bytea), but I wouldn't foreclose
that possibility.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Representation of index clause lists

2011-12-23 Thread Robert Haas
On Fri, Dec 23, 2011 at 7:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Thoughts either way?

OidFunctionCall11?  How about making a struct out of some or all of
those arguments and passing that?

What about using arrays rather than Lists?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Representation of index clause lists

2011-12-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Dec 23, 2011 at 7:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Thoughts either way?

 OidFunctionCall11?  How about making a struct out of some or all of
 those arguments and passing that?

Well, that was what I meant by the allusion to IndexPath --- I'm
inclined to pass root and indexpath as the only input arguments,
with the other inputs being found in the path struct.  (Hm, maybe
outer_rel would have to remain separate too.)  I guess we could
also think about merging the four output parameters into one output
struct, but I'm less excited about that since it would involve inventing
a struct that's not used otherwise.

 What about using arrays rather than Lists?

I think lists are more convenient to work with and iterate over, as long
as you only need iteration and not random access, which is the case here
AFAICS.  In particular, if we changed to arrays then we'd have to
reinvent assorted operations like list_union that are currently used on
these lists, and I don't see any bang for the buck there.

regards, tom lane



Re: [HACKERS] WIP: explain analyze with 'rows' but not timing

2011-12-23 Thread Pavel Stehule
2011/12/23 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2011/12/23 Tom Lane t...@sss.pgh.pa.us:
 Tomas Vondra t...@fuzzy.cz writes:
 The motivation for this patch was that collecting timing data often
 causes performance issues and in some cases it's not needed. But is this
 true for row counts?

 Perhaps more to the point, is there a use case for collecting timing
 data without row counts?  I find it hard to visualize a valid reason.

 yes - searching for bad predictions

 No, because timing alone proves nothing at all.  The machine could just
 have been overloaded.

sorry, I didn't understand the question.

Using time alone is not practical.

Regards

Pavel


                        regards, tom lane
